Firefly

T Tests


My question is regarding the importance of T-testing. Below is a link I found with a T-table in it, and I was wondering if I could ask you guys some questions regarding an example I found in the Encyclopedia of Trading?

 

http://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf

 

Info from the example (evaluating an optimised system on out-of-sample data):

Number of trades = 47

Mean average trade = 974.47

Deviation of each trade = (trade result - sample mean)

Sample standard deviation = SQRT( sum((deviation)^2) / 46 ) = 6091.1

Expected deviation from the mean (standard error) = 6091.1 / SQRT(47) = 888.48

T-statistic = 974.47 / 888.48 = 1.0968
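
If it helps, here is how I'd reproduce those steps in Python. The trade list shown is made up, since the book's 47 individual out-of-sample results aren't listed here; the arithmetic is the same.

```python
import math

# Hypothetical per-trade profits (illustrative only; the book's 47 actual
# out-of-sample results are not shown in the post).
trades = [1200.0, -350.0, 4875.0, -900.0, 250.0]

n = len(trades)
mean = sum(trades) / n

# Sample standard deviation: divide by n - 1 (46 when there are 47 trades)
sample_sd = math.sqrt(sum((x - mean) ** 2 for x in trades) / (n - 1))

# Standard error of the mean -- the "expected deviation from the mean" above
std_error = sample_sd / math.sqrt(n)

# T-statistic: how many standard errors the mean trade sits above zero
t_stat = mean / std_error

print(f"mean={mean:.2f}  sd={sample_sd:.1f}  se={std_error:.2f}  t={t_stat:.4f}")
```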

 

This is the point where my understanding becomes a little hazy. Using the above link and running down the Y-axis to between 40 and 60 (for 47 trades), and then running across to between 1.05 and 1.303 (for 1.0968), I'm lost as to what both the upper and lower X-axis headings are used for. The book talks about the following:

 

"The smaller the number, the more

likely the system performed the way it did for reasons other than chance. In this

instance, the probability was 0.1392; i.e., if a system with a true (population) profit of $0 was repeatedly tested on independent samples, only about 14% of the time

would it show a profit as high as that actually observed."
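
For what it's worth, that 0.1392 figure looks like the one-tailed probability of a Student's t value of 1.0968 with 46 degrees of freedom. A quick sketch, assuming SciPy is available:

```python
from scipy.stats import t

t_stat = 1.0968   # from the example above
df = 47 - 1       # degrees of freedom = number of trades - 1

# One-tailed probability of seeing a t this large (or larger) if the true
# population profit were $0 -- roughly the 0.1392 quoted in the book.
p_value = t.sf(t_stat, df)
print(f"p = {p_value:.4f}")
```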

 

Am I correct in assuming that as long as the value of the T-statistic is less than the value stated in the T-table, then it's safe to assume the system hasn't been over-optimised to the point of excessive curve fitting prior to running the out-of-sample data?

 

Before the chapter ends, the book states the following as well:

 

"Finally, a considence interval on the probability of winning is estimated. In

the example, there were 16 wins in a sample of 47 trades, which yielded a percentage

of wins equal to 0.3404. Using a particular inverse of the cumulative binomial

distribution, upper 99% and lower 99% boundaries are calculated. There is a

99% probability that the percentage of wins in the population as a whole is

between 0.1702 and 0.5319. In Excel, the CRITBINOM function may be used in

the calculation of confidence intervals on percentages."
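
As a rough illustration of the same idea in Python, here is a sketch using the Clopper-Pearson exact interval (built from the inverse beta distribution). It won't necessarily reproduce the book's CRITBINOM-based figures exactly, but it should land in the same neighbourhood:

```python
from scipy.stats import beta

wins, n = 16, 47
alpha = 0.01  # 1% in each tail, matching the book's upper/lower 99% boundaries

# Clopper-Pearson "exact" bounds on the population win rate, from inverting
# the cumulative binomial via the beta distribution.
lower = beta.ppf(alpha, wins, n - wins + 1) if wins > 0 else 0.0
upper = beta.ppf(1 - alpha, wins + 1, n - wins) if wins < n else 1.0

print(f"observed win rate = {wins / n:.4f}")
print(f"bounds on the population win rate: {lower:.4f} to {upper:.4f}")
```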

 

I don't understand this part.

 

I'd like to thank those of you who take the time to reply to this. If you don't have sufficient time to answer, I'd appreciate it if you could at least point me in the direction of some further reading on this subject, at an entry level (I never covered a lot of stats at school).

 

Rgds

 

Firefly


I would be more concerned with the small number of trades in the sample rather than getting all worked up about the T-test. Is it not possible to get more data and more trades - say at least 100? Forty-seven is not enough to draw the conclusions you would like to draw, in my opinion.


Well, the reason to use a T-test is precisely when your sample size is very small... so it's an appropriate example.

 

Yeah, the T-statistic = (actual value - estimated value) / standard error.

 

The estimated value is usually taken from a normal distribution. I don't remember how many trades occurred in your example... I think it was 100... so say you picked many people at random and each of them randomly picked 100 trades from the same opportunity space your example used...

 

once you normalize the results of these trades, the returns will look like a bell curve.

 

So what it's really saying is: what is the likelihood a random person could pick close to the same trades that your system did? Say there were 500 trading opportunities and the average outcome of these opportunities was a certain value; if you were picking close to the average with your 100 choices, then your results will be close to the average and your system might as well be random.
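
As a rough sketch of that idea (with an entirely made-up pool of 500 trade outcomes), you could simulate many random pickers and see how often they match the system's average trade:

```python
import random
import statistics

# Illustrative only: a hypothetical pool of 500 trade outcomes (the
# "opportunity space") whose true mean is roughly $0 per trade.
random.seed(1)
pool = [random.gauss(0, 6000) for _ in range(500)]

system_avg = 974.47  # the system's average trade from the example
picks = 100          # number of trades the system took

# How often does a purely random selection of 100 trades do at least as well?
random_avgs = [statistics.mean(random.sample(pool, picks)) for _ in range(10_000)]
p_random = sum(avg >= system_avg for avg in random_avgs) / len(random_avgs)

print(f"fraction of random pickers doing at least as well: {p_random:.3f}")
```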

 

not sure if this is making sense or if I'm going too simple on the explanation side.

 

For curve fitting... at first glance it seems curve fitting is the other side of the T-test... obviously, if your results are not obtained by random selection, they could be curve fit. But this is not necessarily true... my understanding is that the T-test cannot really tell you if you are curve fitting. It can only suggest whether your results are close enough to be called random, or maybe something other than random. But it would be incorrect to say that just because a system is not random, it must be curve fit.

 

Curve fitting in the context of system development refers more to something you *don't* want to do... yet you still need to achieve results that are not consistent with picking trades at random.

 

This means that curve fitting in trading is more of an art form than a science... there is no way to detect curve fitting per se. You can use out-of-sample testing to indicate that curve fitting was occurring, but just because you get different results out of sample does not mean your results were curve fit. They could have been random (a T-test could tell you this), or it could be that the distribution you are studying is not normal (you get a lot of fat-tailed distributions in finance).

 

Hope that helps. For further reading, check out the Wikipedia article on statistical hypothesis testing: http://en.wikipedia.org/wiki/Statistical_hypothesis_testing

 

Here is their example, which is good:

"As an example, consider determining whether a suitcase contains some radioactive material. Placed under a Geiger counter, it produces 10 counts per minute. The null hypothesis is that no radioactive material is in the suitcase and that all measured counts are due to ambient radioactivity typical of the surrounding air and harmless objects. We can then calculate how likely it is that the null hypothesis produces 10 counts per minute. If the null hypothesis predicts (say) on average 9 counts per minute and a standard deviation of 1 count per minute, then we say that the suitcase is compatible with the null hypothesis. (This does not guarantee that there is no radioactive material, just that we have no reason to believe it); on the other hand, if the null hypothesis predicts 3 counts per minute and a standard deviation of 1 count per minute, then the suitcase is not compatible with the null hypothesis, and there are likely other factors responsible to produce the measurements."

