

Master's degree thesis

Title:


Liquidity, investor structure and ETF tracking error


Name:

张雪妍


student ID:

2101212393


Department:


HSBC Business School


major:


Master of Finance


research direction:


FinTech


Supervisor:


Liu Boxiao (Associate Professor)


□ Academic Degree □ Professional Degree


May 2024


Copyright Notice


Any organization or individual that collects or holds any version of this thesis may not lend it to others without the consent of the author, nor may the thesis be copied, transcribed, photographed, or disseminated in any way. Anyone whose violation affects the author's copyright shall bear legal liability.

Peking University Master's Thesis

Abstract


This article studies how ETF liquidity and investor structure affect ETF tracking error. Compared with overseas markets, retail investors account for a higher proportion of the Chinese market. As noise traders, retail investors often push the prices of investment targets away from their true values. This article therefore examines whether ETF liquidity has a greater impact on ETF tracking error when the proportion of retail investors is higher. To this end, the article collects data on listed ETFs from 2013 to 2023, uses the difference between the ETF net value return and the ETF secondary market price return as the tracking error, and measures ETF liquidity with the Amihud liquidity index. The proportion of fund shares held by retail investors and other control variables are then introduced in a panel regression. The results show that, on the one hand, the worse an ETF's liquidity, the greater its tracking error. The reason is that arbitrage by Authorized Participants (APs) reduces tracking error, but weak liquidity raises APs' transaction costs, which inhibits arbitrage and enlarges tracking error. On the other hand, after adding the interaction term between ETF liquidity and the retail investor holding ratio, the coefficient on ETF liquidity alone is no longer significant, whereas the interaction term is significant at the 1% level. This indicates that in the Chinese market, the higher the proportion of retail investors, the greater the impact of ETF liquidity on ETF tracking error.


Keywords: ETF, liquidity, investor structure, tracking error

How Liquidity Affects Exchange-traded Funds Tracking Errors? Evidence from Chinese Market

Xueyan Zhang (Fintech)

Directed by BaiXiao Liu

ABSTRACT

This paper studies the impact of ETF liquidity and investor structure on ETF tracking error. Compared with overseas markets, the proportion of retail investors in the Chinese market is higher. As noise traders, retail investors often cause unnecessary fluctuations that push investment targets away from their true values. This paper therefore examines whether ETF liquidity has a greater impact on ETF tracking error when the proportion of retail investors is higher. The results show that, on the one hand, lower ETF liquidity leads to higher ETF tracking error. The reason is that the arbitrage activities of Authorized Participants (APs) reduce ETF tracking error, while illiquidity increases trading costs, thus inhibiting arbitrage and resulting in larger tracking errors. On the other hand, the paper adds an interaction variable, ETF illiquidity multiplied by the individual (retail) holding percentage, to the regression. The coefficient on ETF illiquidity is then no longer significant, but the interaction variable is significant at the 1% level. This indicates that in the Chinese market, the higher the proportion of retail ownership, the greater the impact of ETF illiquidity on ETF tracking error.

KEY WORDS: ETF, Liquidity, Investor Structure, Tracking Error


Contents


Chapter 1 Introduction
1.1 Topic selection background
1.1.1 Passive investing and ETFs
1.1.2 Characteristics of China’s stock market
1.2 Research content and methods
1.3 Research contribution and innovation
Chapter 2 Review of related literature
2.1 Liquidity
2.2 ETF tracking error
2.3 Retail investor behavior
Chapter 3 Theoretical Analysis and Research Hypotheses
Chapter 4 Empirical Research Design
4.1 Sample screening and data
4.2 Variable selection and calculation
4.2.1 Tracking error
4.2.2 Liquidity
4.2.3 Control variables
4.2.4 Instrumental variables
4.3 Model design
4.3.1 Panel regression model
Chapter 5 Empirical Research Results
5.1 Descriptive statistics of variables
5.2 Main regression results
5.3 Robustness check
5.3.1 Dependent variable substitution
5.3.2 Independent variable substitution
5.3.3 Variable winsorization
Chapter 6 Causal Identification Strategy
6.1 Instrumental variable regression
Chapter 7 Heterogeneity Test
7.1 Fund size
7.2 Fund trading volume
7.3 Holding concentration
7.4 Underlying asset trading market
Chapter 8 Summary and Reflection
References
Acknowledgments
Peking University Dissertation Originality Statement and Use Authorization Instructions



Chapter 1 Introduction


1.1 Topic selection background


1.1.1 Passive investing and ETFs


The idea of the Exchange-Traded Fund (ETF) originated from the 1987 U.S. stock market crash. As a passive investment vehicle, an ETF replicates or tracks a market index through diversified holdings and passive management, so that it earns the index return at minimal transaction cost. By investing in a basket of stocks, it effectively reduces the non-systematic risk of a portfolio. Because its trading mechanism is more flexible and its costs are lower than those of ordinary index funds, it has become one of the most popular financial products among investors in recent decades (Yang Mozhu, 2013). As of the end of February 2024, there were 12,063 ETFs worldwide, with a total scale of US$12.3 trillion. The U.S. ETF market is the largest, accounting for about 70% of the global scale.


Compared with the global ETF market, China's ETF market started late. The listing of the ChinaAMC SSE 50 ETF marked its birth. The development of ETFs in China can be divided into three stages. The initial stage ran from 2005 to 2009, with few ETF products and a small market size. The rapid development stage ran from 2010 to 2017: the China Securities Regulatory Commission (CSRC) relaxed the conditions for ETF issuance, the number and scale of ETF products surged, the total scale exceeded 100 billion in 2013, and China became the second largest ETF market in Asia. The period from 2018 to the present has been a stage of vigorous development: with the emergence of the bull market in 2018, the number and scale of ETFs have grown explosively. As of the end of February 2024, 933 ETFs were listed in China, with a total scale of 2.41 trillion yuan.


As the ETF market expands, its regulatory requirements are becoming more complete, with particular emphasis on monitoring the quality of ETF liquidity. ETF market makers, as liquidity providers, supply liquidity to the ETF market through cross-market arbitrage. The management of liquidity service providers is therefore being steadily standardized. According to the "Shenzhen Stock Exchange Securities Investment Fund Business Guidelines No. 2 - Liquidity Services" revised in 2023, providers of liquidity services for fund products are to be clearly classified, and different types of providers are to be managed differently; the requirements, business rules and management specifications for the main liquidity service providers are clarified; and the relevant information disclosure requirements are also clarified to ensure the fairness and transparency of ETF market information.


1.1.2 Characteristics of China’s stock market


Because of its huge investor base, China's stock market is growing in scale and social influence and has formed an investor structure with its own characteristics. Research shows that factors such as irrational trading behavior disturb the normal price formation mechanism and market order, which in turn affects investors' ability to obtain reasonable long-term returns from the market. Taken together, the structure and behavior of China's investors have the following characteristics:


The number of investors is large, and individual investors account for a high proportion.


China's stock market has a huge investor base, including many small and medium-sized retail investors. In recent years, the absolute number of investors has been rising, with individual investors accounting for the majority of accounts. At the same time, ordinary individual investors have actively entered the market: the cumulative number of new natural-person investors over the past five years has exceeded 70 million, indicating that Chinese households are gradually shifting their wealth from savings to investment. The shareholding ratio of individual investors has remained at around 40% and increased significantly in 2023.


Individual investors trade frequently, and the main subjects of irrational trading behavior are individual investors.


Judging from the capital turnover rate (the ratio of an investor's transaction amount to the capital invested), individual investors turn over their capital at a relatively high rate and trade far more frequently than institutions. Irrational trading behavior also comes mainly from individual investors. Empirical evidence shows that the holders and traders of low-priced stocks, poorly performing stocks, high-P/E stocks and ST stocks are mainly small and medium-sized individual investors. In addition, individual investors generally show a strong speculative tendency, seeking short-term gains from price fluctuations, and are strongly affected by market sentiment and hot news.


The collective irrational behavior of investors has contributed to stock market fluctuations.


Investors' irrational behavior amplifies stock market fluctuations, and this phenomenon is even more pronounced in the Chinese stock market, which is dominated by retail investors. For example, investors with newly opened accounts, small asset sizes, little trading experience and high risk appetite have directly or indirectly contributed to the surge of new stocks on their first day of listing. Their buying first pushed up the stock price and then fueled speculative sentiment. Moreover, analysis of new stocks that skyrocketed shows that the proportion of high-risk-preference investors, new-account investors and small investors is higher than in new stocks with normal price growth.


Therefore, the A-share market has distinct retail characteristics, which is one of the reasons for the active trading in the A-share market. Compared with mature overseas capital markets, the shareholding ratio of institutional investors in the A-share market is relatively low. This difference allows the A-share market to exhibit unique market characteristics and performance.


Combining the importance of liquidity in the research of investment products, the retail structure and the development status of ETF, this article studies the relationship between ETF tracking error, liquidity and investor structure. This not only helps to better understand the operating mechanism and error sources of the ETF market, but also provides investors with better investment advice and risk management strategies, while providing theoretical and empirical basis for the regulation and investment practice of the ETF market.


1.2 Research content and methods


This article studies the relationship between ETF tracking error, liquidity and investor structure. Taking ETFs listed between 2013 and 2023 as the research sample, it calculates ETF tracking error, liquidity indicators and other variables, and performs panel regression analysis. First, ETF tracking error is regressed on liquidity and the retail investor holding ratio. Second, an interaction term, liquidity × retail investor holding ratio, is added to study how ETF tracking error in the Chinese market relates to investor structure and liquidity jointly. Finally, the sample is grouped by fund shares, trading volume, number of holders, and the market in which the underlying assets trade, to test whether the relationship between ETF tracking error, liquidity and investor structure is heterogeneous.


The research methods of this article include:


1. Theoretical analysis method


Based on existing research data and conclusions, conduct a systematic analysis of the impact of liquidity and investor structure on tracking error.


2. Empirical analysis method


Using econometric methods, perform regression analysis on the selected sample and the related influencing factors to identify the relationships among them.


The article is mainly divided into the following parts: Chapter 1 is the introduction, which mainly discusses the background of the topic selection, outlines the research content, and explains the contribution and innovation of the article. Chapter 2 is a review of relevant literature, mainly including theoretical and empirical research on liquidity, retail investment behavior and ETF tracking error. The third chapter is theoretical analysis and research hypotheses. Through the analysis of the arbitrage behavior of authorized participants and the impact of retail investors on market fluctuations, the research hypotheses of the paper are given. Chapter 4 is the empirical research design, including the sources of data, calculation of variables and model design. Chapter 5 is the empirical research results, including descriptive statistics of variables, analysis of main regression results and robustness testing. Chapter 6 is the causal identification strategy, which mainly focuses on possible endogeneity problems in the model and further discusses the causal relationship between independent variables and dependent variables. Instrumental variables are introduced for regression and the results are analyzed. Chapter 7 is the heterogeneity test, which is mainly based on further research on the main regression results and discusses the impact of independent variables on dependent variables under different classifications.


1.3 Research contribution and innovation


1. Previous research on liquidity focused mainly on the effect of market liquidity on stock volatility, whereas this article takes as its subject ETFs, which are funds that trade like stocks.


2. Compared with past ETF-related research, this article focuses on the risk factors affecting the tracking error of the ETF product itself, rather than the impact of the emergence of ETF products on existing investment targets (such as stock pricing efficiency or stock price fluctuations).


3. The reference literature mainly studies the relationship between liquidity risk and ETF returns, variances and tracking errors. This article instead focuses on the causes of ETF tracking error and, based on the characteristics of the market studied, introduces the retail investor structure as a variable to study how the interaction between retail holdings and liquidity relates to tracking error.


This article mainly refers to the research of Bae and Kim (2020), and uses Chinese ETF data to study and discuss the relationship between its liquidity, investor structure and tracking error.


Chapter 2 Review of related literature


2.1 Liquidity


Research on liquidity has a long history, and scholars at home and abroad have conducted a great deal of research in this field. Generally speaking, liquidity can be divided into three levels: 1. macro liquidity, which mainly refers to the money supply; 2. meso liquidity, which mainly refers to the liquidity of financial markets or financial institutions; 3. micro liquidity, one of the research objects of this article, which refers to the ability of financial assets to be converted into cash under market conditions.


Research on liquidity mainly focuses on the relationship between liquidity and asset returns and on how systematic liquidity risk is priced in asset returns. For example, Acharya and Pedersen (2005) and Amihud (2002) found that because liquidity is persistent, it can be used to predict market returns: low liquidity today implies low liquidity tomorrow, so investors demand higher returns as compensation. Amihud and Mendelson (1986) used the bid-ask spread as an indicator of liquidity to study the relationship between liquidity and stock returns and found that stocks with larger bid-ask spreads have higher expected returns. Jones (2002) collected 100 years of bid-ask spread data for sample stocks in the Dow Jones Index and found that the average proportional spread and turnover rate can predict the market's excess returns.


In terms of liquidity risk pricing, Acharya and Pedersen (2005) established a liquidity-adjusted Capital Asset Pricing Model (LCAPM) and found that the returns of individual assets are strongly correlated with liquidity risk. Pastor and Stambaugh (2003) found that individual stock returns are affected by overall market liquidity. Eckbo and Norli (2002) found that the proportional bid-ask spreads of individual stocks co-move markedly, and confirmed that the liquidity risk caused by this co-movement is indeed priced in cross-sectional portfolio returns. Gibson and Mougeot (2004) also found, using monthly data, evidence that liquidity risk is priced in the US market.


In terms of ETF liquidity research, Borkovec et al. (2010) and Madhavan (2012) studied the pricing of ETFs in the context of the flash crash of US stocks. Borkovec et al. found that a sharp increase in the bid-ask spread leads to the failure of ETF price discovery. Cespa and Foucault (2014) established a theoretical model showing that low ETF liquidity increases uncertainty about the underlying assets, and that this uncertainty in turn further weakens the liquidity of the related ETFs. Clifford et al. (2014) found that ETF flows increase with large trading volume, small spreads and high price/NAV ratios. Roncalli and Zheng (2014) measured the liquidity of ETFs and of their underlying benchmark indexes and found that the two were correlated at the daily frequency but showed no obvious intraday relationship.


2.2 ETF tracking error


In terms of measuring ETF tracking error, Haugen and Baker (1990) proposed three indicators. The first is the coefficient of determination, which measures how well the model fits the sample observations and reflects the degree to which the independent variables jointly explain the dependent variable. The closer the coefficient of determination is to 1, the better the independent variables explain the dependent variable. The second is the beta of the index portfolio, which measures the volatility of the index portfolio relative to the overall market. The third is the variance of the difference between the return series of the indexed portfolio and that of the underlying index. Gastineau (2004) measures the performance of an ETF relative to its benchmark index by analyzing the operating efficiency of the ETF.


On the causes of ETF tracking error, Chiang (1998) pointed out that transaction costs, cash flows, dividends, and changes in underlying index constituents all affect the size of tracking error. Ammann and Zimmermann (2001) pointed out that different asset allocation strategies lead to different sizes of tracking error in investment portfolios. Kostovetsky (2003) compared ETFs with passive index funds in terms of factors such as investor trading preferences and taxes, and pointed out that the difference between ETFs and benchmark indexes mainly comes from management fees and changes in index constituents. Vardharaj et al. (2004) pointed out that different asset allocation strategies of the underlying index have different effects on the tracking error of the indexed portfolio. Bae and Kim (2020) studied the US ETF market and found that less liquid ETFs have larger tracking errors, and established the causal link between liquidity and tracking error with the instrumental variable method.


On arbitrage and ETF tracking error, Lee and Shleifer (1991) and Pontiff (1996) found that because ETFs can be traded in the primary and secondary markets at the same time, the resulting arbitrage opportunities make the tracking error of ETFs smaller than that of ordinary passive index funds. Moussawi and Stahel (2016) studied the 2015 flash crash and found that arbitrage can reduce tracking error. Peterffy (2010) reported that a deterioration of arbitrage liquidity leads to larger ETF tracking errors; Pan and Zeng (2016) also found through event analysis that the loss of arbitrageurs hinders the reduction of ETF tracking errors.


In terms of domestic research, Chen Jiawei and Tian Yinghua (2005) discussed the causes of ETF tracking error in light of ETF discounts and premiums. Liu Wei et al. (2009) analyzed the causes of intraday errors in ETFs using intraday data of the ChinaAMC SSE 50 ETF and the Huaan SSE 180 ETF. Huo Mingyun (2010) analyzed the factors affecting ETF arbitrage costs and the reasons for differences in cash balances. Ma Bin (2010) used ETF portfolios to conduct spot-futures arbitrage and found that a large number of such arbitrage opportunities exist in China. Li Fengyu (2014) explained ETF discounts and premiums from the perspective of investor sentiment and found that the relationship between investor sentiment and the ETF premium rate in the A-share market differs across market environments: the two are negatively correlated in a pessimistic market and positively correlated in a neutral or optimistic market.


2.3 Retail investor behavior


In research on retail investors, the classic theory is the "herding effect". Christie and Huang (1995) described the herding effect as a phenomenon in which traders ignore their actual asset conditions and blindly follow market trends rather than relying on personal judgment. Hwang and Salmon (2004) defined herding as arising when investors no longer base their investment decisions on their own thinking and judgment but on the choices of others or the direction of the market.


Research on the herding effect mainly focuses on investor sentiment. Rubesam et al. (2021) found that when investors are overconfident or overly optimistic, they tend to herd. Tang Yong et al. (2020) confirmed empirically that the herding effect is stronger when the stock market falls; further, Ma Li (2016) found that the price limits in China's securities market may be one of the reasons why the herding effect is weaker when the market rises than when it falls.


Regarding the investment behavior of retail investors and stock price fluctuations, Friedman (1953) and Fama (1965) believed that the investment behavior of retail investors has no impact on the pricing of financial assets. In subsequent research, however, Cutler et al. (1990) proposed that because individual and institutional investors obtain different information, individual investors are often at an informational disadvantage, and most of them cannot process the information they receive effectively. This leads to herding behavior, which causes stock prices to deviate from their intrinsic value. In domestic market research, Yan Jiayuan (2015) argues that individual investors often trade irrationally because of their limited ability to process information, and that this behavior causes large fluctuations in stock prices through the herding effect.


Chapter 3 Theoretical Analysis and Research Hypotheses


In the absence of arbitrage opportunities, the ETF secondary market price and the ETF net value should remain consistent; otherwise there is room for arbitrage by APs. In practice, however, the ETF secondary market price and the ETF net value are affected by different factors, such as transaction rebalancing, product structure, market changes and transaction costs, so the two returns rarely coincide, which gives rise to ETF tracking error. Because APs can trade in both the primary market and the secondary market, the emergence of ETF tracking error prompts them to conduct arbitrage (Figure 3.1), and this arbitrage gradually shrinks the ETF tracking error toward 0.


Figure 3.1 ETF arbitrage


However, if an ETF's liquidity is poor, APs face higher transaction costs when conducting arbitrage. Higher transaction costs discourage APs from arbitraging, so the ETF's tracking error no longer shrinks. This may be the channel through which liquidity affects ETF tracking error.
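The argument can be illustrated with a minimal sketch. It assumes that APs trade only while the gap between price and net value exceeds their per-share transaction cost; the function names, the cost parameter and all numbers are hypothetical and are not taken from the data in this article.

```python
def ap_arbitrage_profit(price: float, nav: float, cost: float) -> float:
    """Per-share profit an AP could lock in, net of transaction cost.

    price: ETF secondary market price (hypothetical input)
    nav:   ETF net value per share (hypothetical input)
    cost:  round-trip transaction cost per share, assumed to be
           higher when the ETF is illiquid
    """
    gap = abs(price - nav)       # size of the premium or discount
    return max(gap - cost, 0.0)  # no trade if the gap does not cover the cost


def gap_after_arbitrage(price: float, nav: float, cost: float) -> float:
    """Gap that remains once every profitable arbitrage trade is done.

    Arbitrage stops as soon as the remaining gap no longer exceeds the
    cost, so the gap cannot be pushed below min(initial gap, cost).
    """
    return min(abs(price - nav), cost)


# Same initial gap of 0.05 under two hypothetical liquidity regimes.
liquid = gap_after_arbitrage(price=1.05, nav=1.00, cost=0.01)
illiquid = gap_after_arbitrage(price=1.05, nav=1.00, cost=0.04)
print(liquid, illiquid)
```

Under these assumptions the remaining gap is bounded by the transaction cost, so a higher cost in the illiquid case leaves a larger residual deviation.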


On the other hand, retail investors are often regarded as noise traders. Their ability to obtain, process and understand information is limited, and these limitations often lead them to trade irrationally. Such irrational behavior is contagious and causes stock prices to experience large, random jumps (Black, 1986). Cutler et al. (1990) further linked this to the herd effect among retail investors (Bikhchandani et al., 1992): in a group of investors with asymmetric information, when earlier investors' decisions are visible and the action space is discrete, the behavior of early investors influences the decisions of those who follow, which often drives the stock price away from its intrinsic value. Therefore, in funds where retail holdings are concentrated, herding by retail investors may push the price of the ETF away from its intrinsic value (i.e., its net value), further enlarging the tracking error.


For ETFs, an increase in retail investors makes prices fluctuate further away from value. When retail investors hold a larger share of an ETF, its price is more prone to sharp rises and falls, which makes APs pay more attention to liquidity. Therefore, the larger the retail holding, the more pronounced the impact of liquidity on tracking error.


Based on the above analysis, this article proposes the following hypothesis:


H1: The higher the proportion of an ETF held by retail investors, the greater the impact of the ETF's liquidity on its tracking error.
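One way to read H1 is through a regression with an interaction term. The specification below is only a sketch under assumed notation: the symbols $TE$, $Illiq$, $Retail$, $\beta$ and $\varepsilon$ are introduced here for illustration, control variables are omitted, and the model actually estimated is the one specified in Section 4.3.

```latex
TE_{i,t} = \beta_0 + \beta_1\, Illiq_{i,t} + \beta_2\, Retail_{i,t}
         + \beta_3\, \left( Illiq_{i,t} \times Retail_{i,t} \right) + \varepsilon_{i,t}

\frac{\partial TE_{i,t}}{\partial Illiq_{i,t}} = \beta_1 + \beta_3\, Retail_{i,t}
```

Under this reading, H1 corresponds to $\beta_3 > 0$: the marginal effect of illiquidity on tracking error grows with the retail investor holding ratio.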


Chapter 4 Empirical Research Design


4.1 Sample screening and data


This article extracts the basic information of listed funds in the CSMAR database and counts the number of ETFs established and listed in the domestic market from 2004 to 2023. In order to ensure that the data sample is sufficient, the period from 2013 to 2023 is selected as the research range. Among them, the selection criteria are:


The fund type is bond or stock;


The investment method is passive;


The fund has been listed on the Shanghai Stock Exchange or Shenzhen Stock Exchange;


Exclude QDII and Hong Kong Stock Connect type funds;


Exclude alternative investment funds;


After screening, the final sample contains 842 funds.
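The screening rules above can be expressed as a simple filter. The sketch below is illustrative only: the records, the field names and the category labels are hypothetical and do not correspond to actual CSMAR fields or funds.

```python
from typing import Dict, List

# Hypothetical fund records; field names and values are assumptions
# made for illustration, not actual CSMAR columns.
funds: List[Dict[str, object]] = [
    {"code": "F1", "type": "stock", "passive": True, "exchange": "SSE", "category": None},
    {"code": "F2", "type": "stock", "passive": True, "exchange": "SZSE", "category": "QDII"},
    {"code": "F3", "type": "bond", "passive": True, "exchange": "SZSE", "category": None},
    {"code": "F4", "type": "commodity", "passive": True, "exchange": "SSE", "category": "alternative"},
    {"code": "F5", "type": "stock", "passive": False, "exchange": "SSE", "category": None},
]

# Categories removed by the last two screening rules.
EXCLUDED_CATEGORIES = {"QDII", "HK Stock Connect", "alternative"}


def passes_screen(fund: Dict[str, object]) -> bool:
    """Apply the five screening criteria to one fund record."""
    return (
        fund["type"] in {"stock", "bond"}            # bond or stock fund
        and fund["passive"] is True                  # passive investment method
        and fund["exchange"] in {"SSE", "SZSE"}      # listed in Shanghai or Shenzhen
        and fund["category"] not in EXCLUDED_CATEGORIES
    )


sample_codes = [fund["code"] for fund in funds if passes_screen(fund)]
print(sample_codes)
```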


See Table 4.1 for specific ETF statistics.


Table 4.1 Number of listed ETFs by year

Year    Newly added    Cumulative
2005    1              1
2006    3              4
2007    1              5
2009    2              7
2010    8              15
2011    21             36
2012    8              44
2013    28             72
2014    14             86
2015    18             104
2016    9              113
2017    15             128
2018    27             155
2019    67             222
2020    97             319
2021    247            566
2022    110            676
2023    166            842


When constructing the data, variables are rescaled to comparable orders of magnitude so that regression coefficients do not become too small merely because the variables differ in scale. In addition, to limit the influence of extreme values in the independent variable (the Amihud liquidity index) and the dependent variable (tracking error), both are standardized, normalized and smoothed with a rolling average, in line with the actual data frequency.
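The three operations can be sketched as follows. The text does not state the exact formulas, so this sketch assumes z-score standardization, min-max normalization and a simple trailing rolling mean; the function names and the sample series are hypothetical.

```python
from statistics import pstdev
from typing import List


def standardize(xs: List[float]) -> List[float]:
    """Z-score: subtract the mean and divide by the standard deviation."""
    mu = sum(xs) / len(xs)
    sigma = pstdev(xs)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(x - mu) / sigma for x in xs]


def normalize(xs: List[float]) -> List[float]:
    """Min-max scaling of the series to the interval [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(x - lo) / (hi - lo) for x in xs]


def rolling_mean(xs: List[float], window: int) -> List[float]:
    """Trailing moving average; the first window-1 points are dropped."""
    return [
        sum(xs[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(xs))
    ]


series = [1.0, 2.0, 3.0, 4.0]          # hypothetical raw values
print(standardize(series))
print(normalize(series))
print(rolling_mean(series, window=2))
```

A rolling mean dampens single-day spikes, which is why it is a natural companion to the rescaling steps when daily data contain extreme values.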


4.2 Variable selection and calculation


4.2.1 Tracking error


When calculating ETF tracking error, we choose to use the difference between the daily return rate of the ETF's net value and its secondary market daily return rate as the tracking error indicator of the ETF.
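A minimal sketch of this calculation is given below. It assumes simple daily returns and takes the difference as price return minus net value return, averaged over a rolling window as described in Table 4.2 (120 trading days there; a short window is used here so the example stays small). The function names and the input series are hypothetical.

```python
from typing import List


def simple_returns(levels: List[float]) -> List[float]:
    """Daily simple returns from a series of price or net value levels."""
    return [levels[t] / levels[t - 1] - 1.0 for t in range(1, len(levels))]


def tracking_error(prices: List[float], navs: List[float], window: int) -> List[float]:
    """Rolling mean of (price return - net value return).

    prices: ETF secondary market closing prices (hypothetical input)
    navs:   ETF net values per share on the same days (hypothetical input)
    window: number of trading days in the rolling average
    """
    if len(prices) != len(navs):
        raise ValueError("prices and navs must cover the same days")
    diff = [p - n for p, n in zip(simple_returns(prices), simple_returns(navs))]
    return [
        sum(diff[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(diff))
    ]


prices = [1.00, 1.02, 1.01, 1.03]   # hypothetical secondary market prices
navs = [1.00, 1.01, 1.01, 1.02]     # hypothetical net values
print(tracking_error(prices, navs, window=2))
```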


4.2.2 Liquidity


When it comes to selecting ETF liquidity indicators, since ETFs have trading attributes similar to stocks, reviewing the research results related to stocks, there are the following indicators for measuring liquidity:


Bid-ask spread:

$S = P_A - P_B$


where $P_A$ is the best ask (lowest selling) price and $P_B$ is the best bid (highest buying) price.


The larger the bid-ask spread, the poorer the liquidity of the underlying asset. The limitation of this indicator is that it cannot reflect the price impact of large transactions, nor can it capture transactions completed inside or outside the quoted spread. In addition, high-priced stocks generally have larger bid-ask spreads, so the indicator makes it difficult to compare liquidity across stocks.
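The comparability problem can be seen in a small sketch. The relative (proportional) spread used for contrast is a common normalization added here for illustration only and is not a measure used in this article; the quotes and function names are hypothetical.

```python
def quoted_spread(ask: float, bid: float) -> float:
    """Absolute bid-ask spread: best ask minus best bid."""
    return ask - bid


def relative_spread(ask: float, bid: float) -> float:
    """Spread divided by the quote midpoint, so price levels cancel out."""
    midpoint = (ask + bid) / 2.0
    return (ask - bid) / midpoint


# Hypothetical quotes for a high-priced and a low-priced security.
high_priced = (100.10, 100.00)   # (ask, bid)
low_priced = (1.01, 1.00)

print(quoted_spread(*high_priced), quoted_spread(*low_priced))
print(relative_spread(*high_priced), relative_spread(*low_priced))
```

In this example the high-priced security has the larger absolute spread but the smaller relative spread, so the absolute spread alone would rank the two securities misleadingly.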


Amihud Illiquidity Ratio:

$\text{Amihud Illiquidity}_{i,t} = \dfrac{|r_{i,t}|}{V_{i,t}}$


where $r_{i,t}$ is the return of stock $i$ on day $t$, and $V_{i,t}$ is the trading volume of stock $i$ on day $t$.


This measure comes from Amihud (2002), "Illiquidity and stock returns: cross-section and time-series effects". It captures the size of the price change associated with a given trading volume. If a stock is liquid, its price should change less for a given trading volume. Therefore, the smaller the Amihud indicator, the better the stock's liquidity.
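The ratio can be computed day by day as in the sketch below. The function name and the input values are hypothetical, and skipping zero-volume days is an assumption made here to avoid division by zero; the article does not state how such days are handled.

```python
from typing import List


def amihud_illiquidity(returns: List[float], volumes: List[float]) -> List[float]:
    """Daily Amihud ratio |r_t| / V_t for one security.

    returns: daily returns (hypothetical input)
    volumes: daily trading volumes on the same days (hypothetical input)
    Days with zero volume are skipped (an assumption of this sketch).
    """
    if len(returns) != len(volumes):
        raise ValueError("returns and volumes must cover the same days")
    return [abs(r) / v for r, v in zip(returns, volumes) if v > 0]


# The same 1% absolute return on a thinly and a heavily traded day.
ratios = amihud_illiquidity(returns=[0.01, -0.01], volumes=[1e6, 1e8])
print(ratios)
```

The same absolute return on one hundredth of the volume yields a ratio one hundred times larger, which is the sense in which a larger Amihud value indicates poorer liquidity.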


Turnover rate:

$\text{Turnover} = \dfrac{V}{S \times P}$


Among them, V is the trading volume, S is the total number of outstanding shares, and P is the average trading price of the stock.


The turnover rate is used to measure the average holding time of a stock. The larger the turnover rate, the shorter the average holding time of the stock, the more active the stock trading, and the better the liquidity of the stock.


Taking into account the advantages and disadvantages of each liquidity index, this article chooses the Amihud liquidity index to measure the liquidity of ETFs.


In order to study whether retail investor holdings affect the tracking error of ETFs, this article selects the retail investor holding ratio as a data indicator to measure ETF retail investor holdings.


4.2.3 Control variables


Following existing research, the control variables in the regressions are: circulating shares, daily trading volume, benchmark index volatility, number of holders, eligibility for securities lending, cash ratio, current-period subscription ratio, and fund replication method.


4.2.4 Instrumental variables


According to portfolio theory, Friedman (1961) argued that when the money supply decreases, money makes up a smaller share of investors' total wealth, which indirectly raises the marginal utility of cash. Investors then tend to sell non-monetary assets (such as stocks and funds) to increase their actual wealth. Conversely, when the money supply increases, investors are more likely to reduce their cash holdings and move funds into investment and financial products that carry some risk. Michael (1974) likewise held that an increase in the money supply leads investors to keep rebalancing between money and other valuable assets until they reach the utility-maximizing combination of the two. The money supply is therefore related to the amount investors put into funds, while it has no direct effect on ETF tracking error. On this basis, this article uses the money supply M1 as the instrumental variable to address possible endogeneity problems.


The specific indicator expression and calculation method are shown in Table 4.2.


Table 4.2 Variable explanation and calculation

| Variable name | Definition |
| --- | --- |
| Tracking Error (ETF-NAV %) | The average, over a rolling window of 120 trading days, of the difference between the ETF secondary market price return and the ETF net value return. |
| Tracking Error (ETF-IND %) | The average, over a rolling window of 120 trading days, of the difference between the ETF secondary market price return and the ETF benchmark index return. |
| Tracking Error (NAV-IND %) | The average, over a rolling window of 120 trading days, of the difference between the ETF net value return and the ETF benchmark index return. |
| ETF Illiquidity | The rolling 120-trading-day average of the Amihud illiquidity indicator, where the daily indicator = absolute value of the ETF daily return / ETF trading volume on that day (Amihud, 2002). If the trading volume on a day is 0, the indicator for that day is set to 0. |
| Individual Percentage | The proportion of the ETF held by retail investors in the current period, as disclosed in the ETF's regular reports (annual report, semi-annual report). |
| Interaction | Interaction = ETF Illiquidity × Individual Percentage |
| Shares Outstanding (in ¥billions) | ETF circulating shares. |
| Trading Volume (in ¥billions) | Trading volume × closing price of the day. |
| Expense Ratio (%) | ETF annual management fee. |
| Index Volatility | The standard deviation of index returns over a rolling window of 120 trading days. |
| Holder Number (in thousands) | The total number of holders at the end of the ETF reporting period. |
| Short Selling | 1 if the ETF is eligible for securities lending, 0 otherwise. |
| Cash & Deposit Ratio | The cash portion of the portfolio (including bank deposits and liquidation reserves) as a percentage of total fund assets, as disclosed in regular reports. |
| Requisition Ratio | The current-period subscription amount of the fund as a proportion of its total size. |
| Replicated | Index replication method: 1 for full replication, 0 otherwise. |
| Turnover (%) | The fund's daily turnover rate = (daily trading volume (lots) × 100) / circulating shares of the listed fund as of that day × 100. |
| M1 (in ¥trillions) | Cash in circulation (M0) + demand deposits, published monthly by the People's Bank of China. |
| M1×ETF Illiquidity | M1 multiplied by ETF Illiquidity; control variable. |


4.3 Model design


4.3.1 Panel regression model


Because the data vary across both funds and time, this article uses a panel regression model to analyze ETF liquidity, retail investor structure and ETF tracking error. The regression model is specified as follows:

$\text{Tracking Error} = \alpha + \beta_1 \times \text{ETF Illiquidity} + \beta_2 \times \text{Individual Percentage} + \text{Controls} + \varepsilon$ (4.1)


To further study how the interaction between ETF liquidity and retail investor structure affects ETF tracking error, the interaction term is added to the regression model:

$\text{Tracking Error} = \alpha + \beta_1 \times \text{ETF Illiquidity} + \beta_2 \times \text{Individual Percentage} + \beta_3 \times \text{Interaction} + \text{Controls} + \varepsilon$ (4.2)


Because the variables are observed at different frequencies, all data are converted to a semi-annual frequency (fund holder structure data are published only in fund annual and semi-annual reports). After deleting observations with missing benchmark index data, missing fund net value data, and similar gaps, 3,249 observations remain as the regression sample.
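

The specification in Formula 4.2 with fund and time fixed effects can be sketched as below. This is an illustrative pure-Python version, not the estimation code used in this article: it assumes a balanced panel (where the two-way within transformation is exact), reports coefficients only, and does not compute clustered standard errors. All function names are hypothetical.

```python
from collections import defaultdict

def demean_two_way(values, funds, periods):
    """Within transformation: x - fund mean - period mean + grand mean."""
    def group_means(keys):
        sums, counts = defaultdict(float), defaultdict(int)
        for key, value in zip(keys, values):
            sums[key] += value
            counts[key] += 1
        return {key: sums[key] / counts[key] for key in sums}

    fund_mean, period_mean = group_means(funds), group_means(periods)
    grand = sum(values) / len(values)
    return [
        v - fund_mean[f] - period_mean[p] + grand
        for v, f, p in zip(values, funds, periods)
    ]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fe_ols(y, regressors, funds, periods):
    """Slope coefficients from OLS on two-way demeaned data."""
    yd = demean_two_way(y, funds, periods)
    xd = [demean_two_way(x, funds, periods) for x in regressors]
    k = len(xd)
    xtx = [[sum(p * q for p, q in zip(xd[i], xd[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(xd[i], yd)) for i in range(k)]
    return solve(xtx, xty)
```

Passing ETF Illiquidity, Individual Percentage and their product as the three regressors returns estimates of β1, β2 and β3; the control variables would be appended to the same list.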


Chapter 5 Empirical Research Results


5.1 Descriptive statistics of variables


Table 5.1 shows the descriptive statistics of the main variables used in this study. The statistics reported are the number of observations, the mean, the standard deviation, and the 1st, 50th and 99th percentiles.


Table 5.1 Descriptive statistics of variables

| Variables | N | Mean | SD | P1 | P50 | P99 |
| --- | --- | --- | --- | --- | --- | --- |
| Tracking Error (ETF-NAV %) | 3,249 | 0.31 | 0.29 | 0.07 | 0.19 | 1.41 |
| Tracking Error (ETF-IND %) | 3,249 | 0.31 | 0.27 | 0.07 | 0.20 | 1.29 |
| Tracking Error (NAV-IND %) | 3,249 | 0.05 | 0.09 | 0.01 | 0.03 | 0.46 |
| ETF Illiquidity | 3,249 | 2.88e-3 | 0.03 | 4.68e-8 | 6.24e-6 | 0.04 |
| Individual Percentage | 3,249 | 0.46 | 0.32 | 2.80e-3 | 0.43 | 1.00 |
| Interaction | 3,249 | 4.86e-4 | 4.84e-3 | 1.03e-8 | 2.42e-6 | 0.01 |
| Shares Outstanding (in ¥billions) | 3,249 | 1.38 | 3.55 | 3.00e-3 | 0.21 | 19.51 |
| Trading Volume (in ¥billions) | 3,249 | 0.07 | 0.27 | 0.00 | 6.63e-3 | 1.43 |
| Expense Ratio (%) | 3,249 | 0.44 | 0.13 | 0.15 | 0.50 | 0.60 |
| Index Volatility | 3,249 | 1.18 | 15.57 | 1.01e-3 | 0.01 | 0.03 |
| Holder Number (in thousands) | 3,249 | 2.07e-3 | 5.98e-3 | 1.61e-5 | 3.28e-4 | 0.03 |
| Short Selling | 3,249 | 0.12 | 0.33 | 0.00 | 0.00 | 1.00 |
| Cash & Deposit Ratio | 3,249 | 0.02 | 0.04 | 1.50e-3 | 0.02 | 0.09 |
| Requisition Ratio | 3,249 | 0.48 | 1.24 | 0.00 | 0.19 | 5.25 |
| Replicated | 3,249 | 0.97 | 0.16 | 0.00 | 1.00 | 1.00 |
| Turnover (%) | 3,249 | 0.06 | 0.48 | 0.00 | 0.02 | 0.40 |
| M1 (in ¥trillions) | 3,249 | 64.54 | 4.434 | 54.39 | 64.74 | 69.56 |
| M1×ETF Illiquidity | 3,249 | 0.02 | 0.17 | 2.88e-7 | 4.03e-5 | 0.25 |


5.2 Main regression results


Formulas 4.1 and 4.2 are estimated separately, controlling for time and individual (fund) fixed effects; the results are shown in Table 5.2. Column (1) reports the regression without the interaction term. The coefficient on the ETF illiquidity indicator is significant at the 1% level, indicating that ETF liquidity affects ETF tracking error. The coefficient is 0.945: holding other conditions constant, a one-unit increase in the ETF illiquidity indicator is associated with an increase of about 0.95 percentage points in tracking error. Because the indicator is calculated as the absolute daily return divided by the trading volume of the day, a highly liquid ETF shows little price movement for a given trading volume and therefore has a smaller indicator value. Combined with the positive coefficient, the results indicate that the worse the liquidity of the ETF, the greater its tracking error.


Column (2) reports the regression after adding the interaction term. Once the interaction between ETF illiquidity and the retail investor holding ratio is included, the coefficient on ETF illiquidity itself is no longer significant, while the coefficient on the interaction term is significant at the 1% level. The coefficient of the interaction term is 7.199, indicating that the interaction term has a positive impact on ETF tracking error.


These results show that, in the Chinese market, ETFs with weak liquidity have higher tracking errors. Moreover, because domestic investors are mainly retail investors, the relationship between tracking error and liquidity is more pronounced for ETFs with a higher proportion of retail holdings.


To sum up, in the Chinese ETF market, a higher proportion of retail investor holdings amplifies the impact of ETF illiquidity on ETF tracking error.
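

As a worked illustration of what the interaction term implies, the point estimates in column (2) of Table 5.2 can be combined into a marginal effect of illiquidity that depends on the retail holding ratio. The evaluation points used here (the mean and median of Individual Percentage in Table 5.1) are chosen for illustration only, and the calculation ignores estimation uncertainty:

$$\frac{\partial\,\text{Tracking Error}}{\partial\,\text{ETF Illiquidity}} = \beta_1 + \beta_3 \times \text{Individual Percentage} \approx 0.154 + 7.199 \times \text{Individual Percentage}$$

$$\text{at the mean } (0.46):\; 0.154 + 7.199 \times 0.46 \approx 3.47, \qquad \text{at the median } (0.43):\; 0.154 + 7.199 \times 0.43 \approx 3.25$$

Under these point estimates, the implied effect of illiquidity on tracking error rises with the retail holding ratio, which is the pattern summarized above.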


In addition, in both regressions, benchmark index volatility, ETF trading volume, the ETF cash holding ratio and the ETF subscription ratio are all significantly related to ETF tracking error.


Table 5.2 Panel regression results

| Dependent variable: Tracking Error (ETF-NAV %) | (1) | (2) |
| --- | --- | --- |
| ETF Illiquidity | 0.945*** | 0.154 |
| | (0.25) | (0.29) |
| Individual Percentage | 0.033 | 0.027 |
| | (0.06) | (0.06) |
| Interaction | | 7.199*** |
| | | (2.65) |
| Shares Outstanding (in ¥billions) | 0.001 | 0.001 |
| | (0.00) | (0.00) |
| Trading Volume (in ¥billions) | 0.120* | 0.119* |
| | (0.06) | (0.06) |
| Expense Ratio (%) | 0.031 | 0.026 |
| | (0.42) | (0.41) |
| Index Volatility | -0.001*** | -0.001*** |
| | (0.00) | (0.00) |
| Holder Number (in thousands) | -0.350 | -0.286 |
| | (1.81) | (1.81) |
| Short Selling | 0.007 | 0.006 |
| | (0.02) | (0.02) |
| Cash & Deposit Ratio | -0.264** | -0.506*** |
| | (0.11) | (0.19) |
| Requisition Ratio | 0.039** | 0.040** |
| | (0.02) | (0.02) |
| Replicated | -0.027 | -0.029 |
| | (0.02) | (0.02) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 3213 | 3213 |
| Adjusted R2 | 0.588 | 0.591 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively. The robust standard errors after clustering are in parentheses. The explained variables in each column are ETF tracking errors. 2. Column (1) represents the regression results without adding interaction terms, and column (2) represents the regression results after adding interaction terms. 3. Columns (1) and (2) both control date and fund fixed effects. 4. The definition of relevant variables is as shown in Table 4.2 of this article.


5.3 Robustness check


In order to prevent the research results from being affected by other factors, this article conducts the following three robustness tests to ensure the reliability and stability of the research results.


5.3.1 Dependent variable substitution


Two alternative dependent variables are used in the panel regression: the rolling 120-day average of the difference between the ETF's net value return and its benchmark return, and the rolling 120-day average of the difference between the ETF's secondary market price return and its benchmark return. The regression results are shown in Table 5.3.


From Table 5.3, we can see that the regression coefficients of the interaction term on the replaced dependent variable are significant at the 10% and 5% levels respectively, indicating that the research results are robust.


Table 5.3 Robustness test regression results (dependent variable replacement)

| Dependent variable | (1) Tracking Error (NAV-IND %) | (2) Tracking Error (ETF-IND %) |
| --- | --- | --- |
| ETF Illiquidity | -0.049** | 0.221 |
| | (0.03) | (0.27) |
| Individual Percentage | -0.002 | 0.025 |
| | (0.01) | (0.05) |
| Interaction | 0.561* | 5.886** |
| | (0.32) | (2.87) |
| Shares Outstanding (in ¥billions) | 0.000 | -0.001 |
| | (0.00) | (0.00) |
| Trading Volume (in ¥billions) | -0.007 | 0.071** |
| | (0.01) | (0.03) |
| Expense Ratio (%) | -0.025 | 0.004 |
| | (0.02) | (0.38) |
| Index Volatility | 0.000 | -0.001*** |
| | (0.00) | (0.00) |
| Holder Number (in thousands) | -0.253 | 1.268 |
| | (0.43) | (1.22) |
| Short Selling | 0.005 | 0.022 |
| | (0.00) | (0.01) |
| Cash & Deposit Ratio | -0.128*** | -0.193 |
| | (0.03) | (0.24) |
| Requisition Ratio | -0.009*** | 0.021** |
| | (0.00) | (0.01) |
| Replicated | 0.005** | -0.018 |
| | (0.00) | (0.02) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 3213 | 3213 |
| Adjusted R2 | 0.610 | 0.693 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. The result of column (1) is that the dependent variable is the difference between the ETF's net return and its benchmark return, and the result of column (2) is the difference between the ETF's price return and its benchmark return. 3. Columns (1) and (2) both control date and fund fixed effects.


5.3.2 Independent variable substitution


Since ETFs have secondary market trading characteristics similar to stocks, stock-type liquidity indicators can also be used to measure ETF liquidity. Referring to the discussion on liquidity measurement indicators in Chapter 4, this article uses ETF turnover rate to replace the Amihud liquidity indicator and conducts a robustness test on the regression results. The results are shown in Table 5.4.


Table 5.4 shows that after replacing the liquidity indicator, the coefficient of the new interaction term on tracking error is still significant at the 5% level. The coefficient is negative, which is consistent with the main regression: a higher turnover rate means better liquidity, whereas a higher Amihud value means worse liquidity, so the two measures are expected to enter with opposite signs. This shows that the regression results are robust.


Table 5.4 Robustness test regression results (independent variable replacement)

| Dependent variable: Tracking Error (ETF-NAV %) | (1) |
| --- | --- |
| Turnover (%) | -0.761* |
| | (0.44) |
| Individual Percentage | 0.084 |
| | (0.07) |
| Turnover (%)×Individual Percentage | -1.453** |
| | (0.72) |
| Shares Outstanding (in ¥billions) | 0.001 |
| | (0.00) |
| Trading Volume (in ¥billions) | 0.116* |
| | (0.06) |
| Expense Ratio (%) | 0.021 |
| | (0.42) |
| Index Volatility | -0.001*** |
| | (0.00) |
| Holder Number (in thousands) | -0.136 |
| | (1.73) |
| Short Selling | 0.007 |
| | (0.02) |
| Cash & Deposit Ratio | -0.196 |
| | (0.17) |
| Requisition Ratio | 0.039** |
| | (0.02) |
| Replicated | 0.066 |
| | (0.07) |
| Year-month Fixed Effects | YES |
| Fund Fixed Effects | YES |
| Nobs. | 3213 |
| Adjusted R2 | 0.586 |


***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses.


5.3.3 Variable winsorization


The independent and dependent variables are winsorized at the 1% level in each tail; that is, values below the 1% quantile or above the 99% quantile are replaced with the 1% and 99% quantile values respectively. The regression results are shown in Table 5.5.


In Table 5.5, column (1) winsorizes only the dependent variable, column (2) winsorizes only the independent variables (ETF Illiquidity, Individual Percentage), and column (3) winsorizes both the independent and the dependent variables. In all three cases the coefficient of the interaction term is significant and its sign is consistent with the main regression, indicating that the research results are robust.
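

The winsorization step can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes linear interpolation between order statistics for the quantiles, which is one common convention and may differ from the software used for the reported results.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) of a sorted list, by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def winsorize(values, lower=1.0, upper=99.0):
    """Replace values outside the [lower, upper] percentiles with the bounds."""
    ordered = sorted(values)
    low_bound = percentile(ordered, lower)
    high_bound = percentile(ordered, upper)
    return [min(max(v, low_bound), high_bound) for v in values]
```

Unlike trimming, this keeps every observation in the sample and only caps the extreme values.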


Table 5.5 Robustness test regression results (variables winsorized)

| Dependent variable | (1) Tracking Error (ETF-NAV_w %) | (2) Tracking Error (ETF-NAV %) | (3) Tracking Error (ETF-NAV_w %) |
| --- | --- | --- | --- |
| ETF Illiquidity | 0.152 | 13.315** | 11.770*** |
| | (0.26) | (2.41) | (2.06) |
| Individual Percentage | 0.038 | 0.052 | 0.059 |
| | (0.05) | (0.05) | (0.04) |
| Interaction | 6.774*** | 12.562** | 13.092*** |
| | (2.44) | (6.28) | (4.69) |
| Shares Outstanding (in ¥billions) | 0.000 | 0.002 | 0.001 |
| | (0.00) | (0.00) | (0.00) |
| Trading Volume (in ¥billions) | 0.066** | 0.104* | 0.052* |
| | (0.03) | (0.06) | (0.03) |
| Expense Ratio (%) | 0.000 | 0.092 | 0.057 |
| | (0.35) | (0.35) | (0.30) |
| Index Volatility | -0.001*** | -0.001*** | -0.001*** |
| | (0.00) | (0.00) | (0.00) |
| Holder Number (in thousands) | 0.474 | -1.414 | -0.543 |
| | (1.31) | (1.85) | (1.32) |
| Short Selling | 0.019 | -0.004 | 0.010 |
| | (0.01) | (0.02) | (0.01) |
| Cash & Deposit Ratio | -0.452** | -0.232* | -0.198 |
| | (0.18) | (0.13) | (0.13) |
| Requisition Ratio | 0.028** | 0.040** | 0.029** |
| | (0.01) | (0.02) | (0.01) |
| Replicated | -0.044* | -0.020 | -0.036** |
| | (0.03) | (0.02) | (0.01) |
| Year-month Fixed Effects | YES | YES | YES |
| Fund Fixed Effects | YES | YES | YES |
| Nobs. | 3213 | 3213 | 3213 |
| Adjusted R2 | 0.639 | 0.625 | 0.671 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. Column (1) winsorizes the dependent variable at 1% in each tail and leaves the other variables unchanged; column (2) winsorizes the independent variables ETF Illiquidity and Individual Percentage at 1% in each tail and leaves the other variables unchanged; column (3) winsorizes both the independent and the dependent variables at 1% in each tail and leaves the remaining variables unchanged.


Chapter 6 Cause and Effect Identification Strategy


6.1 Instrumental variable regression


In the empirical research results of Chapter 5, it is found that the tracking error of ETFs with poor liquidity is larger; at the same time, this effect will be strengthened on ETFs with a higher proportion of retail investors. In order to further establish the causal relationship between independent variables and dependent variables, and to solve potential endogeneity problems, including omitted variables, sample selection, two-way causality and measurement error, this chapter uses the instrumental variable method for further verification.


Friedman (1961) argued that a reduction in the money supply lowers the share of money in investors' total wealth, which indirectly raises the marginal utility of cash, so investors sell non-monetary assets such as stocks and funds to increase their real wealth; conversely, when the money supply increases, investors are more likely to reduce their cash holdings and purchase investment and financial products that carry some risk. This article therefore uses M1, an indicator of the money supply, as the instrumental variable.


The instrumental variable regression results are shown in Table 6.1. With the instrumental variable method, the coefficient of the interaction term on tracking error remains significant (at the 10% level), which suggests that the main finding is not driven by the potential endogeneity problems.
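

The two-stage procedure behind Table 6.1 can be sketched as follows. This is an illustrative pure-Python version with hypothetical function names, not the estimation code used in this article: fixed effects and clustered standard errors are omitted, and each variable is passed as a plain list of observations.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(y, columns):
    """OLS coefficients of y on the given columns (pass a constant explicitly)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)

def fitted(columns, beta):
    """Fitted values for the given columns and coefficients."""
    return [sum(b * col[r] for b, col in zip(beta, columns))
            for r in range(len(columns[0]))]

def two_stage_least_squares(y, endogenous, exogenous, instruments):
    """Return coefficients ordered as endogenous regressors, then exogenous."""
    first_stage = instruments + exogenous
    # Stage 1: project each endogenous regressor on instruments and exogenous variables.
    hats = [fitted(first_stage, ols(x, first_stage)) for x in endogenous]
    # Stage 2: regress y on the fitted values plus the exogenous variables.
    return ols(y, hats + exogenous)
```

In the setting of Table 6.1, the endogenous regressors would be Individual Percentage and Interaction, and the instruments M1 and M1×ETF Illiquidity.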


Table 6.1 Instrumental variable regression

| Dependent variable | (1) Individual Percentage | (2) Interaction | (3) Tracking Error (ETF-NAV %) |
| --- | --- | --- | --- |
| M1 (in ¥trillions) | 0.011*** | -0.000*** | |
| | (0.00) | (0.00) | |
| M1×ETF Illiquidity | -2.263*** | 0.056*** | |
| | (0.61) | (0.01) | |
| ETF Illiquidity | 12.110*** | -0.205*** | 5.766** |
| | (3.45) | (0.04) | (2.76) |
| Individual Percentage | | | 1.564*** |
| | | | (0.24) |
| Interaction | | | 44.704* |
| | | | (25.12) |
| Shares Outstanding (in ¥billions) | -0.0276 | 0.000 | -0.048*** |
| | (0.00) | (0.00) | (0.01) |
| Trading Volume (in ¥billions) | -0.068* | -0.000* | -0.044 |
| | (0.03) | (0.00) | (0.05) |
| Expense Ratio (%) | 0.290 | 0.001 | 0.660 |
| | (0.04) | (0.00) | (0.11) |
| Index Volatility | -0.001** | -0.000 | -0.004*** |
| | (0.00) | (0.00) | (0.00) |
| Holder Number (in thousands) | 0.202*** | -0.001 | 0.287 |
| | (0.14) | (0.02) | (0.06) |
| Short Selling | -0.128 | 0.000 | -0.294 |
| | (0.02) | (0.00) | (0.04) |
| Cash & Deposit Ratio | 0.958 | 0.019*** | 2.815*** |
| | (0.16) | (0.00) | (0.81) |
| Requisition Ratio | -0.005*** | -0.000 | 0.009*** |
| | (0.00) | (0.00) | (0.01) |
| Replicated | 0.181* | 0.001*** | 0.505*** |
| | (0.03) | (0.00) | (0.09) |
| Year-month Fixed Effects | YES | YES | YES |
| Fund Fixed Effects | YES | YES | YES |
| F statistics | 35.72 | 32.99 | |
| Nobs. | 3236 | 3236 | 3236 |
| Adjusted R2 | 0.138 | 0.577 | 0.646 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. Columns (1) and (2) report the first stage of the instrumental variable regression, and column (3) reports the second stage, in which the dependent variable is regressed on the fitted values from the first stage.


Chapter 7 Heterogeneity Test


To further study the relationship between the independent and dependent variables, this chapter conducts heterogeneity tests: the sample is split according to different classification criteria and the relationship is examined within each subsample. Drawing on the main regression results and the factors that affect ETF tracking error, the following are expected to influence tracking error: fund size, subscription and redemption ratio, cash ratio in the asset portfolio, and the trading market of the underlying assets.


7.1 Fund size


For small funds held mostly by retail investors, two findings are relevant: retail investors hold the underlying assets for shorter periods than institutional investors (Chang Jiang, 2008), and Hsieh et al. (2020) found that the herding effect of retail investors is stronger in small-cap stocks. By analogy, among smaller ETFs, retail holdings should strengthen the effect of ETF liquidity on tracking error. Based on this analysis, this article puts forward the following hypothesis:


H2: For funds with small share sizes, high retail investor holdings cause ETF liquidity to have a greater impact on ETF tracking error.


To test this hypothesis, the sample ETFs are classified by share size: funds with more than 200 million shares are treated as large-share funds and the rest as small-share funds. The regression is run separately on the two groups, and the results are shown in Table 7.1.
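

The sample split used in the heterogeneity tests can be sketched as a simple threshold rule. The record layout, field names and fund codes below are hypothetical; the 200 million share cutoff is the one stated above, and the same helper would apply to the other cutoffs used in this chapter (daily trading volume of 10 million RMB, 10,000 holders).

```python
def split_by_threshold(funds, key, threshold):
    """Split fund records into (above threshold, at or below threshold)."""
    above = [f for f in funds if f[key] > threshold]
    rest = [f for f in funds if f[key] <= threshold]
    return above, rest

# Hypothetical fund records, for illustration only.
sample = [
    {"code": "ETF_A", "shares": 350_000_000},
    {"code": "ETF_B", "shares": 80_000_000},
    {"code": "ETF_C", "shares": 200_000_000},
]
large, small = split_by_threshold(sample, "shares", 200_000_000)
print([f["code"] for f in large])  # ['ETF_A']
print([f["code"] for f in small])  # ['ETF_B', 'ETF_C']
```

Each group is then estimated separately with the specification in Formula 4.2.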


Table 7.1 Heterogeneity test results 1

| Dependent variable: Tracking Error (ETF-NAV %) | (1) | (2) |
| --- | --- | --- |
| ETF Illiquidity | 23.684*** | 0.102 |
| | (7.71) | (0.27) |
| Individual Percentage | -0.160** | 0.055 |
| | (0.07) | (0.08) |
| Interaction | 8.743 | 5.824*** |
| | (8.12) | (2.04) |
| Shares Outstanding (in ¥billions) | 0.002 | -0.122 |
| | (0.00) | (0.36) |
| Trading Volume (in ¥billions) | 0.062 | -0.438 |
| | (0.04) | (0.59) |
| Expense Ratio (%) | -0.110 | 1.127** |
| | (0.16) | (0.48) |
| Index Volatility | -0.000*** | -0.002*** |
| | (0.00) | (0.00) |
| Holder Number (in thousands) | -1.152 | -36.851*** |
| | (1.16) | (9.88) |
| Short Selling | -0.011 | -0.048 |
| | (0.02) | (0.06) |
| Cash & Deposit Ratio | -0.455 | -0.408*** |
| | (0.37) | (0.13) |
| Requisition Ratio | 0.079*** | 0.015 |
| | (0.02) | (0.01) |
| Replicated | 0.000 | -0.040 |
| | (.) | (0.06) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 1617 | 1552 |
| Adjusted R2 | 0.426 | 0.692 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. The result of column (1) is large-share funds, and the result of column (2) is small-share funds.


7.2 Fund trading volume


The main regression results show that the worse the liquidity of an ETF, the greater its tracking error. A fund with larger trading volume is more actively traded, that is, more liquid. Based on this analysis, the following hypothesis is put forward:


H3: The smaller the trading volume of a fund, the more its tracking error is affected by the proportion of retail investors.


To test this hypothesis, the sample ETFs are classified by daily trading volume: funds with a daily trading volume greater than 10 million RMB are treated as funds with good liquidity and the rest as funds with poor liquidity. The regression is run separately on the two groups, and the results are shown in Table 7.2.


Table 7.2 Heterogeneity test results 2

| Dependent variable: Tracking Error (ETF-NAV %) | (1) | (2) |
| --- | --- | --- |
| ETF Illiquidity | -433.679 | 0.183 |
| | (786.43) | (0.26) |
| Individual Percentage | -0.065 | 0.042 |
| | (0.07) | (0.07) |
| Interaction | 6202.473 | 5.961*** |
| | (6714.62) | (2.21) |
| Shares Outstanding (in ¥billions) | 0.003 | -0.006 |
| | (0.00) | (0.01) |
| Trading Volume (in ¥billions) | 0.052 | -6.078** |
| | (0.04) | (3.08) |
| Expense Ratio (%) | -0.226*** | -0.649*** |
| | (0.07) | (0.25) |
| Index Volatility | -0.001*** | -0.001*** |
| | (0.00) | (0.00) |
| Holder Number (in thousands) | -2.179* | -30.846*** |
| | (1.27) | (9.99) |
| Short Selling | -0.017 | 0.093*** |
| | (0.02) | (0.02) |
| Cash & Deposit Ratio | -0.300 | -0.401*** |
| | (0.38) | (0.14) |
| Requisition Ratio | 0.092*** | 0.018 |
| | (0.01) | (0.01) |
| Replicated | 0.000 | 0.000 |
| | (.) | (.) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 1306 | 1774 |
| Adjusted R2 | 0.301 | 0.669 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. The result of column (1) is a fund with good liquidity, and the result of column (2) is a fund with poor liquidity.


7.3 Concentration of holdings


The results above show that retail investor holdings amplify the impact of ETF liquidity on ETF tracking error. Does the total number of fund holders change this effect? In this regard, this article puts forward the following hypothesis:


H4: The more dispersed the fund's holding structure, the weaker the amplifying effect of retail holdings on the impact of ETF liquidity on ETF tracking error.


This article treats funds with more than 10,000 holders as funds with dispersed holdings and the rest as funds with concentrated holdings. The regression results for the two groups are shown in Table 7.3.


Table 7.3 Heterogeneity test results 3

| Dependent variable: Tracking Error (ETF-NAV %) | (1) | (2) |
| --- | --- | --- |
| ETF Illiquidity | 1070.789*** | 0.170 |
| | (293.86) | (0.26) |
| Individual Percentage | -0.131* | 0.080 |
| | (0.07) | (0.06) |
| Interaction | 1835.086 | 6.231*** |
| | (1955.36) | (2.25) |
| Shares Outstanding (in ¥billions) | 0.001 | 0.002 |
| | (0.00) | (0.01) |
| Trading Volume (in ¥billions) | 0.019 | 0.299** |
| | (0.03) | (0.14) |
| Expense Ratio (%) | -0.145* | -0.077 |
| | (0.08) | (0.44) |
| Index Volatility | 12.077*** | -0.001*** |
| | (3.44) | (0.00) |
| Holder Number (in thousands) | -1.082 | -166.315*** |
| | (0.97) | (46.82) |
| Short Selling | -0.017 | 0.027 |
| | (0.02) | (0.03) |
| Cash & Deposit Ratio | -1.717** | -0.426*** |
| | (0.82) | (0.14) |
| Requisition Ratio | 0.126*** | 0.015 |
| | (0.02) | (0.01) |
| Replicated | 0.000 | -0.057** |
| | (.) | (0.03) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 886 | 2281 |
| Adjusted R2 | 0.356 | 0.674 |


1. ***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. Column (1) reports funds with dispersed holdings, and column (2) reports funds with concentrated holdings.


7.4 Underlying asset trading market


At present, the domestic trading venues are the Shanghai Stock Exchange, the Shenzhen Stock Exchange and the Beijing Stock Exchange. Because trading activity differs across exchanges, cross-market funds and single-market funds differ in liquidity. Combined with the main regression results, the following hypothesis is put forward:


H5: For cross-market funds, the interaction term between the proportion of retail investors and ETF liquidity has a greater impact on ETF tracking error.


To test this hypothesis, a fund whose underlying assets trade on more than one market is classified as a cross-market fund; otherwise it is a single-market fund. The regression results are shown in Table 7.4.


Table 7.4 Heterogeneity test results 4

| Dependent variable: Tracking Error (ETF-NAV %) | (1) | (2) |
| --- | --- | --- |
| ETF Illiquidity | 3.521* | 0.117 |
| | (1.85) | (0.32) |
| Individual Percentage | 0.050 | -0.067 |
| | (0.04) | (0.13) |
| Interaction | 26.469*** | 5.402 |
| | (8.84) | (3.52) |
| Shares Outstanding (in ¥billions) | 0.004 | -0.001 |
| | (0.00) | (0.00) |
| Trading Volume (in ¥billions) | 0.117* | 0.081* |
| | (0.07) | (0.04) |
| Expense Ratio (%) | 0.487 | -0.332*** |
| | (0.66) | (0.11) |
| Index Volatility | -0.001*** | -0.001*** |
| | (0.00) | (0.00) |
| Holder Number (in thousands) | -2.940 | 0.771 |
| | (2.96) | (1.28) |
| Short Selling | -0.001 | 0.027 |
| | (0.02) | (0.04) |
| Cash & Deposit Ratio | -0.389*** | -0.558 |
| | (0.13) | (0.44) |
| Requisition Ratio | 0.053*** | 0.029 |
| | (0.02) | (0.02) |
| Replicated | 0.000 | 0.000 |
| | (.) | (.) |
| Year-month Fixed Effects | YES | YES |
| Fund Fixed Effects | YES | YES |
| Nobs. | 2381 | 831 |
| Adjusted R2 | 0.578 | 0.624 |


1.***, **, and * indicate significance at the 1%, 5%, and 10% levels respectively, and the robust standard errors after clustering are in parentheses. 2. The result in column (1) is a cross-market fund, and the result in column (2) is a single-market fund.


Chapter 8 Summary and Reflection


This article examines the relationship between retail holdings, ETF (exchange-traded fund) liquidity, and ETF tracking error, and reaches two main conclusions. First, illiquidity and tracking error are positively related: less liquid ETFs tend to exhibit larger tracking errors. Second, the higher the proportion of an ETF held by retail investors, the stronger the effect of liquidity on its tracking error.


Overall, this research is of great significance for understanding the operating mechanism of the ETF market and the impact of retail investment behavior. Compared with the U.S. ETF market, China's ETF market started late. Tracking error is one of the important angles to measure ETF risks. Carrying out relevant research will be helpful and meaningful to the supervision and product innovation of domestic ETFs.


For retail investors, this research sheds light on the risks they may face when holding ETFs, especially less liquid ones. Recognizing this relationship can help retail investors choose their investment portfolios more carefully, thereby reducing investment risks and increasing investment returns.


For ETF issuers and market regulators, the findings highlight the importance of liquidity management. For ETFs with poor liquidity in particular, measures should be taken to improve market liquidity, so as to reduce ETF tracking errors and maintain the stability and healthy development of the market. For regulators, recognizing that liquidity has a stronger effect on tracking error when retail holdings are higher will help them formulate policies and regulations more precisely, protecting the rights and interests of retail investors and the stability of the market.


Future research directions can further explore the following aspects: First, consider expanding the research sample to cover more different types of ETFs and retail investors to verify the universality of the research results. Secondly, other factors that may affect ETF liquidity and tracking errors, such as market volatility, transaction costs, etc., can be further explored to build a more comprehensive model. In addition, the future development trends of the ETF market can also be discussed, especially the possible changes in the operating model of ETFs and investor behavior under the influence of emerging fields such as digital finance and intelligent investment.


The findings of this paper offer investors, regulatory agencies and academia a basis for a deeper understanding of the ETF market, and enrich the directions and conclusions of ETF-related research. It is hoped that they can serve as a reference for future related research and market supervision.


References


Chang Jiang, 2008. Analysis of retail investment characteristics and protection suggestions. Management Observation, 2: 172-173.


Chen Jiawei, Tian Yinghua, 2003. Investment risk analysis based on the phenomenon of ETF discount and premium. Statistics and Decision Making, 3: 92-94.


Huo Mingyun, 2010. Research on factors affecting arbitrage costs of ETF funds. Financial Economics, 2: 77-79.


Li Fengyu, 2014. Can investor sentiment explain the discount and premium of ETFs? ——Empirical evidence from the A-share market. Financial Research, 2: 180-192.


Liu Wei, Chen Min, Liang Bin, 2009. ETF arbitrage analysis based on financial high-frequency data. Chinese Scientific Management, 4: 1-7.


Ma Bin, 2010. Research on stock index futures arbitrage based on ETF. Statistics and Decision Making, 7: 3.


Tang Yong, Hong Xiaomei, Zhu Pengfei, 2020. Limited attention and abnormal characteristics of the stock market and herding effect. Financial Theory and Practice, 1: 11-20.


Yan Jiayuan, 2015. The impact of individual investor behavior on the stock market. Business, 44: 1.


Yang Mozhu, 2013. ETF capital flows, market returns and investor sentiment: empirical evidence from the A-share market. Financial Research, 4: 156-169.


Ma Li, 2016. Empirical analysis of herding effect in China's stock market. Nankai Economic Research, 1: 144-153.

Amihud, Y., 2002. Illiquidity and stock returns: cross-section and time-series effects. Journal of Financial Markets, 5 (1), 31-56.

Amihud, Y., Mendelson, H., 1986. Asset Pricing and the Bid-Ask Spread. Journal of Financial Economics, 17, 223-249.

Ammann, M., Zimmermann, H., 2001. Tracking error and tactical asset allocation. Financial Analysts Journal, 57 (2).

Acharya, V. V., Pedersen, L. H., 2005. Asset pricing with liquidity risk. Journal of Financial Economics, 77 (2), 375-410.

Bae, K., Kim, D., 2020. Liquidity risk and exchange-traded fund returns, variances, and tracking errors. Journal of Financial Economics, 138 (1), 222-253.

Bikhchandani, S., Hirshleifer, D., Welch, I., 1992. A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy, 100 (5), 992-1026.

Black, F., 1986. Noise. Journal of Finance, 41 (3), 528-543.

Borkovec, M., Domowitz, I., Serbin, V., Yegerman, H., 2010. Liquidity and price discovery in exchange-traded funds: one of several possible lessons from the Flash Crash. Journal of Banking and Finance, 27 (9), 1667-1703.

Cespa, G., Foucault, T., 2014. Illiquidity contagion and liquidity crashes. Review of Financial Studies, 27 (6), 1615-1660.

Clifford, C. P., Fulkerson, J. A., Jordan, B. D., 2014. What drives ETF flows? Financial Review, 49 (3), 619-642.

Chiang, W., 1998. Optimizing Performance. Indexing for Maximum Investment Results, GPCo Publishers.

Cutler, D., Poterba, J., Summers, L., 1990. Speculative Dynamics and the Role of Feedback Traders. American Economic Review, 80 (2), 63-68.

Eckbo, B. E., Norli, O., 2002. Pervasive Liquidity Risk. Working Paper, Dartmouth College.

Fama, E. F., 1965. The behavior of stock-market prices. Journal of Business, 38 (1), 34-105.

Friedman, M., 1966. Essays in Positive Economics. University of Chicago Press.

Gibson, R., Mougeot, N., 2004. The Pricing of Systematic Liquidity Risk: Empirical Evidence from the US Stock Market. Journal of Banking and Finance, 28, 157-178.

Haugen, R. A., Baker, N. L., 1990. Dedicated stock portfolios. Journal of Portfolio Management, 16 (4), 17-22.

Hodges, S. D., 1976. Problems in the application of portfolio selection models. Omega, 4 (6), 699-709.

Hsieh, S. F., Chan, C. Y., Wang, M. C., 2020. Retail investor attention and herding behavior. Journal of Empirical Finance, 59, 109-132.

Hwang, S., Rubesam, A., Salmon, M., 2021. Beta herding through overconfidence: a behavioral explanation of low-beta anomaly. Journal of International Money and Finance, 111.


Jones, C. M., Shi, D., Zhang, X., Zhang, X., 2020. Heterogeneity in retail investors: evidence from comprehensive account-level trading and holdings data. SSRN Electronic Journal.

Kostovetsky, L., 2003. Index mutual funds and exchange-traded funds. Journal of Portfolio Management, 29 (4), 80-92.

Lee, C., Shleifer, A., Thaler, R. H., 1991. Investor sentiment and the closed-end fund puzzle. Journal of Finance, 46 (1), 75-109.

Lee, I. H., 1998. Market crashes and informational avalanches. Review of Economic Studies, 65 (4), 741-759.

Madhavan, A., 2012. Exchange-traded funds, market structure, and the Flash Crash. Financial Analysts Journal, 68 (4), 20-35.

Madhavan, A., 2014. Exchange-traded funds: an overview of institutions, trading, and impacts. Annual Review of Financial Economics, 6 (1), 311-341.

Nguyen, V., Phengpis, C., 2009. An analysis of the opening mechanisms of exchange traded fund markets. The Quarterly Review of Economics and Finance, 49 (2), 562-577.

Pan, K., Yao, Z., 2016. ETF Arbitrage under liquidity mismatch. SSRN Electronic Journal.

Pastor, L., Stambaugh, R., 2003. Liquidity Risk and Expected Stock Returns. Journal of Political Economy, 111, 642-685.

Pontiff, J., 1996. Costly arbitrage: evidence from closed-end funds. Quarterly Journal of Economics, 111 (4), 1135-1151.

Roncalli, T., Zheng, B., 2014. Measuring the Liquidity of ETFs: An Application to the European Market. SSRN Electronic Journal.

Vardharaj, R., Fabozzi, F. J., Jones, F. J., 2004. Determinants of tracking error for equity portfolios. Journal of Investing, 13 (2), 37-47.

Acknowledgements


During my three years as a graduate student, I received help and support from many people. I would like to express my most sincere gratitude to all the professors, classmates and family members who supported me.


First of all, I would like to express my deepest gratitude to my supervisor, Professor Liu Baixiao. I first met the professor in a fixed-income course. Although teaching was online at that time, his clear and logical teaching style meant that I learned a great deal from the class. Throughout the writing of this thesis, the professor drew on his expertise and helped me from beginning to end, and I benefited from every set of written comments he provided. Although I was always nervous before our in-person discussions, each meeting confirmed that I had made the right choice.


Secondly, I would like to thank my teammates in the various group projects. Two years of cooperation made our division of labor clear and highly efficient. They were also good partners who comforted one another and shared information during the job search. I hope everyone can live the life they want after graduation. I also want to thank my two lovely roommates in Room 1114 for taking the initiative to welcome me into the dormitory, where I spent three happy years.


I would also like to thank my friends from the "working group", although most of them have already left school. Leaving does not mean separation. I am grateful to them for their timely emotional support whenever I shared my feelings, and for keeping me company while I was at school, which gave me a very happy time free from thoughts of study and work.


Finally, I would like to thank my parents for their moral and financial support during my 25 years of life. I hope they can have fewer worries and enjoy life more in the future.





Peking University Dissertation Originality Statement and Usage Authorization Instructions


Statement of originality


I solemnly declare that the thesis submitted is the result of my independent research work under the guidance of my supervisor. Except for the content cited in the text, this thesis does not contain works or achievements that have been published or written by any other individual or collective. Individuals and groups who have made important contributions to the research in this thesis have been clearly identified in the text. I bear the legal consequences of this statement.


Signature of the author of the paper: Date: Year Month Day


Dissertation Usage Authorization Instructions


(Must be bound in hard copy submitted to the school library)


I fully understand Peking University’s regulations on the collection, preservation, and use of dissertations, namely:


Submit the printed and electronic versions of the thesis as required by the school;


The school has the right to save the printed and electronic versions of dissertations, provide catalog search and reading services, and provide services on the campus network;


The school may use photocopying, microcopying, digitization or other reproduction methods to preserve papers;


If the release of the electronic version of the thesis must be delayed for special reasons, the school is authorized to release the full text on the campus network after □ one year / □ two years / □ three years.


(Confidential papers will comply with this rule after being declassified)


Signature of thesis author: Signature of supervisor:


Date: year month day