TradingView
kocurekc
Jul 2, 2020 02:54

Shannon Entropy V2 

SPDR S&P 500 ETF Trust (Arca)

Description

Version 2, Shannon Entropy
This update adds both a deadband (plotting optional) and PercentRank indication.

Here is a unique way of looking at your price and volume information: use the calculated value of "Shannon entropy". This is a measure of "surprise" in the data; the larger the move or deviation from the most probable value, the higher the new information gain. What I find so interesting about this value is the smoothness with which it displays the information, without using moving averages. There is a lot of meat on this bone to be incorporated into other scripts.

H = -sum(prob(i) * log_base2(prob(i)))
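As a rough illustration of the formula above (plain Python rather than Pine Script; the function name and toy numbers are mine, not the script's), the entropy of a window can be computed by treating each value's share of the window sum as its probability:

```python
import math

def shannon_entropy(values):
    """Shannon entropy of a window, treating each value's share of the
    window sum as its probability: H = -sum(p_i * log2(p_i))."""
    total = sum(values)
    probs = [v / total for v in values]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy example: a flat window carries maximum entropy (log2(n));
# a spike ("surprise") pulls the value below that ceiling.
flat = [10.0] * 8
spiky = [10.0] * 7 + [80.0]
print(shannon_entropy(flat))            # 3.0 == log2(8)
print(round(shannon_entropy(spiky), 3)) # 2.307
```

The drop from the flat-window value is the "new information" the description talks about: the more one bar dominates the window, the further the entropy falls below its ceiling.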

I've included the typical way that I've been experimenting with this, which is the difference between the volume information and the price information. There is an option to turn either the price or the volume data off, so you can see the Shannon entropy of either one alone. There are a ton of complex scripts out there trying to do what this calculation does in three lines. As with anything, there are no free lunches: as you lower the lengths, you'll quickly learn where your Nyquist frequencies are, so you'll want to work at about double the noisy value at a minimum.

This script is based on "information": it highlights places that need your attention, either because there is a large amount of change (new information) or because there is minimal new information (complacency, institutional movements). Buy and sell points are up to the user; this just shows you where to pay attention.

You can use it with or without volume data, and you can also isolate either the source or the volume. Below are some options for printing:


It also works with BTC (better with volume data).



Big shoutout to yatrader2 for great Shannon Entropy discussions.
And to Picte for his interesting inspiration, STOCH-HVP-picte.

Comments
gorx1
Can't edit my prev comment...

Aight, after a lil analysis, here we go:

1) changing to natural logs doesn't change the values 'at all';
2) Shape of the resulting chart is exactly the same as if we drop a nested box car high pass filter (same length parameter) on the data;
- In this case, the values will be oscillating exactly above & below zero line, not around this interesting value resulting from a certain relationship between the entropy and window sizes;
- filter's formula: sma(src - sma(src, length), length)
3) Now the fun comes, you can try it yourself: you can exclude all these log10s from line 28, essentially leaving the cumsums.. And the resulting shape of your outputs will be exactly the same xd, and yes, it will also be scale invariant and stuff;
4) So in essence, the whole thing about entropy and logarithms just basically changes the scale of this nested high-pass output, leaving the shape and invariance the same;
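For reference, gorx1's filter formula from point 2 can be sketched in plain Python (not Pine; the `sma` helper and the linear-ramp example are mine, added for illustration):

```python
def sma(series, length):
    """Simple moving average; None until enough bars have accumulated."""
    out = []
    for i in range(len(series)):
        if i + 1 < length:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - length:i + 1]) / length)
    return out

def nested_boxcar(src, length):
    """sma(src - sma(src, length), length): detrend by a boxcar average,
    then smooth the residual with the same boxcar."""
    inner = sma(src, length)
    residual = [s - m for s, m in zip(src, inner) if m is not None]
    return sma(residual, length)  # warm-up bars are dropped

# A steady linear ramp leaves a constant residual; mean-reverting data
# would instead oscillate above and below zero.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(nested_boxcar(prices, 3))  # [None, None, 1.0, 1.0, 1.0, 1.0]
```

This is only meant to make the comparison in points 2-4 concrete; the thread's discussion of shape and scale invariance refers to the actual script's output.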

So my question is what's up with all that? Do I miss smth?
kocurekc
@gorx1, You've not missed anything important, and you've noted the key difference which is simply the interpretability of the parameters.

From Regression and other stories, Gelman and Hill:
"We prefer natural logs (that is, logarithms base e) because, as described above, coefficients on the natural-log scale are directly interpretable as approximate proportional differences: with a coefficient of 0.06, a difference of 1 in x corresponds to an approximate 6% difference in y, and so forth."

Here your data is basically treated as discrete data (choose whatever binning/bar size you like), and you choose a baseline data set to measure against (the look-back distance sets the average, the true or population measure). The difference from most of the theories is that they are set on continuous data, while the charts provide a discrete approximation of the continuous data (which actually isn't continuous).

You're just wading into the deep history of trying to measure something:
franknielsen.github.io/GSI/Poster-Distances.pdf

Entropy is just a way to quantify the information or uncertainty in something. Your choice of ln() or log() reflects your belief about the equation of the system: is it base-10 exponential, or base-e Euler? It is your choice of the form of the solution to the differential equation. However, as you noted (along with a long history of other smart people), it doesn't matter, because you can translate between the two. I like Andrew Gelman and Richard McElreath the best, because they focus on explainability and causation, and are honest about how hard getting signal from the noise is.
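The "you can translate between the two" point is just the change-of-base identity log_b(x) = ln(x) / ln(b): entropies computed in different bases differ only by a constant factor. A quick check (illustrative distribution, not from the script):

```python
import math

# Changing logarithm base only rescales an entropy value:
# log_b(x) = ln(x) / ln(b), so H_bits = H_nats / ln(2).
probs = [0.5, 0.25, 0.125, 0.125]
h_nats = -sum(p * math.log(p) for p in probs)    # natural log (nats)
h_bits = -sum(p * math.log2(p) for p in probs)   # base 2 (bits)
print(round(h_bits, 6))                # 1.75
print(round(h_nats / math.log(2), 6))  # same value: 1.75
```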

Another item to note: your corollary comments about filtering are correct. It sounds like you are taking more of a John Ehlers approach to the market, mesasoftware.com/TechnicalArticles.htm (and there is nothing wrong with this). If you want to filter the system to understand the underlying behavior, then you are sitting in the sin/cos world of basis-solution forms, and these will work better with natural logs, ln()s.
gorx1
@kocurekc, thank you sir, but I still don't understand the relationship between window sizes and entropy values, whatever the base. I notice that if you take length 100, the values are going to be around ~6.6 and will vary by around 0.2. Or if you take window length 25, the values will stay around 4.6 and vary by around 0.2 again. If you know the underlying principle there, pls tell. To be more precise, how can I model the middle threshold?
kocurekc
@gorx1, So give me some more information here...when you refer to "window sizes", do you mean "len", the entropy length term, or "avg", the averaging length term?
gorx1
I refer to the "entropy length" values I stated in my previous comment, with only "include source" checked. It's not so important tho, it's just my curiosity.
kocurekc
@gorx1, OK, so if you look at the length ("len") variable, it sets two major things. First, rather than the probability from the standard Shannon entropy equation, this creates a measure of proportion: the current bar divided by the sum of the bars over the length (line 28), which is a percentage. This is the information I want to measure: what is the contribution of the current bar (close) against all the previous closes in the past "len" periods. The second item is to convert this proportion into a measure of the information over the period, so line 28 sums the log base-2 (information entropy) over the period.
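A plausible answer to the "middle threshold" question, consistent with the proportion-of-window-sum description above (a sketch in plain Python, not the script's actual line 28; `proportion_entropy` and the simulated window are mine): when prices move only a few percent inside a window, every bar's proportion is close to 1/len, so the entropy sits just under its uniform-distribution ceiling, log2(len). That matches the levels gorx1 reported: log2(100) ≈ 6.64 and log2(25) ≈ 4.64.

```python
import math
import random

def proportion_entropy(window):
    """Entropy of each bar's share of the window sum (base 2)."""
    total = sum(window)
    return -sum((v / total) * math.log2(v / total) for v in window)

print(round(math.log2(100), 2))  # 6.64 -- matches the ~6.6 seen at length 100
print(round(math.log2(25), 2))   # 4.64 -- matches the ~4.6 seen at length 25

# Simulated near-flat window: prices wander +/-3% around 100, so each
# bar's proportion is close to 1/100 and H lands just below log2(100).
random.seed(1)
window = [100 * (1 + random.uniform(-0.03, 0.03)) for _ in range(100)]
print(round(proportion_entropy(window), 4))
```

Under this reading, the "middle threshold" for a given length would be approximately log2(len), with the observed ~0.2 wiggle coming from how far the window deviates from uniform.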

As you've already experimented, there are several different combinations of ways that we can measure. The link above gives ~20 other (but similar) ways; this was just an implementation of one. I liked this one, so I published it.
gorx1
Hi sup, great tool

1) I've noticed there's a strict relation between the entropy values and the moving-window sizes. By any chance, do you know how to model it?;
2) Why log10 and not the natural log? What's the logic behind that choice?

Bro
jeno_
Interesting yet great! Between Shannon Entropy and the Bernoulli Process, which one do you prefer for lower timeframes? I find that Shannon Entropy generates more signals than Bernoulli, but if I understand correctly what I read, Bernoulli was supposed to be better?
kocurekc
@jeno_, Hey thanks, yeah, these are always fun. To try and give you my experience:
I use Bernoulli in the same direction as the higher time frame direction, I use the 1-wk and 1-day charts and only take signals in the direction of the higher time frames. This would be the same as the 1-day & 1-hr chart combination.
For Shannon Entropy, just remember that it is direction independent; it is simply telling you there is more/less information from price/volume/both. I try not to make any moves when the values are peaking high (new information being generated), and then enter and close positions at lows (less information being generated, the market changes have been digested). This is just another way to look at volatility.

So I use them both; finally, I also use OBV-ADX for turns...this is better for picking bottoms and tops (volume exhaustion), Bernoulli for trading within a range, Shannon for timing. I just swing trade, so I use weekly/daily combinations for scanning, then enter/exit on the 1-hr/15-min charts (but I don't watch these for signals). Finally (or really firstly): money management, not over-trading, and setting realistic expectations (15 to 30% net margins in these times, longer term 7-8% returns, because no one is really going to beat the market long term).
kocurekc
@jeno_, Here is the philosophy with NTLX (32% return over 6 months...wait for the pandemic correction to be washed out, then let's enter again)...as a note, you can always try to catch the falling knife, but having done this for a while now (14 years), I can tell you that will be a losing proposition in the long run...wait for the good setups (and don't watch CNBC, or typically do the opposite when they are losing their minds, in either direction).