import pandas as pd
import numpy as np
euro = pd.read_csv('https://stooq.pl/q/d/l/?s=eurpln&d1=20200101&d2=20221231&i=d')
euro
Data | Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | |
---|---|---|---|---|---|
0 | 2020-01-02 | 4.25670 | 4.25805 | 4.23877 | 4.23943 |
1 | 2020-01-03 | 4.23964 | 4.25851 | 4.23757 | 4.24404 |
2 | 2020-01-06 | 4.24375 | 4.25129 | 4.23097 | 4.23154 |
3 | 2020-01-07 | 4.23159 | 4.25007 | 4.23098 | 4.24477 |
4 | 2020-01-08 | 4.24541 | 4.24941 | 4.23132 | 4.23317 |
... | ... | ... | ... | ... | ... |
771 | 2022-12-23 | 4.63904 | 4.65158 | 4.63454 | 4.64259 |
772 | 2022-12-27 | 4.64291 | 4.69614 | 4.64214 | 4.68116 |
773 | 2022-12-28 | 4.68145 | 4.70773 | 4.67490 | 4.68908 |
774 | 2022-12-29 | 4.68849 | 4.70441 | 4.67301 | 4.68387 |
775 | 2022-12-30 | 4.68346 | 4.69397 | 4.67162 | 4.68908 |
776 rows × 5 columns
kod = 'kgh'
poczatek = '20200101'
koniec = '20221231'
We build the URL from variables, which makes it easy to change the parameters.
When a column is designated both as the index and as a date column, the table gets an index of type DatetimeIndex.
kursy = pd.read_csv(f'https://stooq.pl/q/d/l/?s={kod}&d1={poczatek}&d2={koniec}&i=d',
index_col='Data', parse_dates=['Data'])
kursy
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 |
2020-01-07 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 329438.653029 |
2020-01-08 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 263026.093522 |
2020-01-09 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 545410.834420 |
... | ... | ... | ... | ... | ... |
2022-12-23 | 122.4960 | 124.2320 | 121.4050 | 123.0910 | 322849.795078 |
2022-12-27 | 125.9670 | 126.3150 | 124.0330 | 124.3800 | 319922.999996 |
2022-12-28 | 124.2810 | 126.1160 | 123.6860 | 124.3310 | 346183.499996 |
2022-12-29 | 124.7770 | 128.3980 | 123.4880 | 127.3070 | 418031.631143 |
2022-12-30 | 126.6620 | 127.4060 | 125.3230 | 125.7200 | 270589.918030 |
754 rows × 5 columns
kursy.index
DatetimeIndex(['2020-01-02', '2020-01-03', '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10', '2020-01-13', '2020-01-14', '2020-01-15', '2020-01-16', ... '2022-12-16', '2022-12-19', '2022-12-20', '2022-12-21', '2022-12-22', '2022-12-23', '2022-12-27', '2022-12-28', '2022-12-29', '2022-12-30'], dtype='datetime64[ns]', name='Data', length=754, freq=None)
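The same mechanism can be tried without a network connection. The sketch below parses a small in-memory CSV whose three rows are copied from the KGH table above. The names csv_text and probka are made up for this illustration. It checks that combining index_col with parse_dates yields a DatetimeIndex.

```python
import io
import pandas as pd

# Three rows copied from the KGH table above; a stand-in for the downloaded file.
csv_text = """Data,Otwarcie,Najwyzszy,Najnizszy,Zamkniecie,Wolumen
2020-01-02,92.2063,93.7722,92.2063,93.4852,243201.185407
2020-01-03,91.8625,92.6262,90.9458,91.7856,332923.666472
2020-01-07,91.6709,92.8169,90.7166,91.6334,329438.653029
"""

# index_col makes 'Data' the index; parse_dates converts it from text to dates.
probka = pd.read_csv(io.StringIO(csv_text), index_col='Data', parse_dates=['Data'])

print(type(probka.index).__name__)
print(probka.index.dtype)
```

If parse_dates is left out, the index holds plain strings, and the date-based selections shown later in this notebook are not available.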
Statistics for the whole dataset.
kursy.mean()
Otwarcie         133.584715
Najwyzszy        135.880646
Najnizszy        131.053503
Zamkniecie       133.434219
Wolumen       702127.015053
dtype: float64
kursy.agg({
'Otwarcie': ['min', 'mean', 'max'],
'Zamkniecie': ['min', 'mean', 'max'],
'Najnizszy': ['min'],
'Najwyzszy': ['max'],
})
Otwarcie | Zamkniecie | Najnizszy | Najwyzszy | |
---|---|---|---|---|
min | 49.655100 | 47.172500 | 45.8456 | NaN |
mean | 133.584715 | 133.434219 | NaN | NaN |
max | 217.817000 | 213.704000 | NaN | 220.397 |
kursy.Zamkniecie.describe()
count    754.000000
mean     133.434219
std       39.193850
min       47.172500
25%      100.985000
50%      131.948000
75%      170.110000
max      213.704000
Name: Zamkniecie, dtype: float64
Filtering by date
The options available for other indexes also work here: selecting a specific value or a range of rows.
kursy.loc['2020-01-03']
Otwarcie          91.862500
Najwyzszy         92.626200
Najnizszy         90.945800
Zamkniecie        91.785600
Wolumen       332923.666472
Name: 2020-01-03 00:00:00, dtype: float64
kursy.loc['2020-01-03'].to_frame()
2020-01-03 | |
---|---|
Otwarcie | 91.862500 |
Najwyzszy | 92.626200 |
Najnizszy | 90.945800 |
Zamkniecie | 91.785600 |
Wolumen | 332923.666472 |
kursy.loc['2020-01-01':'2020-01-31']
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 2.432012e+05 |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 3.329237e+05 |
2020-01-07 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 3.294387e+05 |
2020-01-08 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 2.630261e+05 |
2020-01-09 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 5.454108e+05 |
2020-01-10 | 92.6262 | 94.8228 | 92.6262 | 94.5358 | 7.248963e+05 |
2020-01-13 | 94.5358 | 95.6817 | 94.5358 | 95.4911 | 6.498969e+05 |
2020-01-14 | 95.5864 | 96.9239 | 95.0134 | 95.2995 | 4.852757e+05 |
2020-01-15 | 95.0134 | 95.0134 | 93.1231 | 94.3066 | 7.564785e+05 |
2020-01-16 | 94.3451 | 95.4718 | 94.1920 | 95.0134 | 4.163470e+05 |
2020-01-17 | 95.4911 | 96.4425 | 95.2426 | 96.4425 | 3.370837e+05 |
2020-01-20 | 96.5869 | 96.9721 | 95.5383 | 95.8743 | 2.308479e+05 |
2020-01-21 | 95.1665 | 95.1665 | 92.8169 | 93.3523 | 7.526144e+05 |
2020-01-22 | 93.1992 | 93.9436 | 92.2631 | 92.6262 | 5.304427e+05 |
2020-01-23 | 91.9579 | 93.7529 | 91.7662 | 92.0147 | 6.944585e+05 |
2020-01-24 | 92.6455 | 94.2305 | 92.6455 | 92.9700 | 8.510463e+05 |
2020-01-27 | 92.0147 | 92.0533 | 88.7685 | 88.9591 | 1.093153e+06 |
2020-01-28 | 89.3791 | 90.6213 | 87.6418 | 89.4753 | 7.447285e+05 |
2020-01-29 | 89.9906 | 91.2510 | 89.1123 | 89.8567 | 5.410743e+05 |
2020-01-30 | 87.5273 | 89.7229 | 87.0881 | 88.3294 | 5.647030e+05 |
2020-01-31 | 88.8061 | 89.4561 | 86.6297 | 87.0881 | 1.135376e+06 |
A DatetimeIndex additionally lets us select a "whole month", a "whole quarter", and so on.
kursy.loc['2020-02']
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-02-03 | 87.6611 | 88.5008 | 87.3934 | 88.4247 | 5.073405e+05 |
2020-02-04 | 89.4368 | 92.2439 | 89.3791 | 92.0917 | 7.423006e+05 |
2020-02-05 | 92.6262 | 94.2498 | 91.7662 | 93.6575 | 6.040226e+05 |
2020-02-06 | 94.1544 | 94.7843 | 92.4355 | 93.2945 | 2.943424e+05 |
2020-02-07 | 92.2439 | 92.4355 | 89.9520 | 90.4682 | 6.271217e+05 |
2020-02-10 | 90.5827 | 91.1933 | 87.2210 | 88.0039 | 6.788902e+05 |
2020-02-11 | 89.0170 | 89.9520 | 88.1002 | 89.9520 | 9.209200e+05 |
2020-02-12 | 89.7614 | 92.4924 | 89.7614 | 92.4355 | 3.400765e+05 |
2020-02-13 | 91.1365 | 91.5563 | 89.7806 | 90.4104 | 3.439467e+05 |
2020-02-14 | 90.7166 | 90.7166 | 89.2076 | 89.6660 | 2.593645e+05 |
2020-02-17 | 90.6213 | 91.6141 | 90.2389 | 91.2510 | 1.192506e+05 |
2020-02-18 | 90.1244 | 90.4489 | 88.9024 | 89.1883 | 2.953933e+05 |
2020-02-19 | 89.7806 | 90.4489 | 89.3607 | 89.4936 | 2.721768e+05 |
2020-02-20 | 89.5900 | 90.4682 | 86.5152 | 88.1002 | 6.458885e+05 |
2020-02-21 | 86.4188 | 87.5648 | 85.7698 | 86.7819 | 3.970838e+05 |
2020-02-24 | 85.0630 | 85.0630 | 80.6130 | 81.0906 | 8.981355e+05 |
2020-02-25 | 81.0906 | 83.9748 | 80.2124 | 80.2317 | 9.587463e+05 |
2020-02-26 | 79.5441 | 80.1171 | 76.3922 | 78.3789 | 1.245142e+06 |
2020-02-27 | 77.8444 | 77.8444 | 71.8095 | 73.1268 | 7.898260e+05 |
2020-02-28 | 70.4719 | 71.6179 | 66.8434 | 67.3596 | 1.863144e+06 |
kursy.loc['2020-Q3']
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-07-01 | 86.4766 | 88.8061 | 86.3620 | 88.7685 | 7.795391e+05 |
2020-07-02 | 88.7877 | 92.1486 | 88.5586 | 91.0796 | 1.135428e+06 |
2020-07-03 | 91.0796 | 91.6709 | 89.4936 | 90.0098 | 3.953891e+05 |
2020-07-06 | 91.7662 | 93.4852 | 91.4610 | 92.6262 | 6.074464e+05 |
2020-07-07 | 92.3585 | 92.7977 | 91.1933 | 92.6262 | 4.325020e+05 |
... | ... | ... | ... | ... | ... |
2020-09-24 | 112.6780 | 114.2100 | 110.6760 | 111.4840 | 7.439849e+05 |
2020-09-25 | 112.3410 | 116.0200 | 111.4370 | 112.7260 | 6.146303e+05 |
2020-09-28 | 115.4040 | 116.7810 | 113.4390 | 115.9710 | 4.908563e+05 |
2020-09-29 | 115.5480 | 115.9710 | 112.7260 | 112.9190 | 3.967744e+05 |
2020-09-30 | 112.3410 | 113.0630 | 109.3370 | 112.6780 | 5.992676e+05 |
66 rows × 5 columns
kursy.loc['2020'].Zamkniecie.mean()
103.60035555555557
kursy.Zamkniecie.loc['2020'].mean()
103.60035555555557
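To make clear what the shorthand selects, the sketch below compares a month string with an explicit date-range slice. It uses a synthetic business-day series with arbitrary values, not the real quotations, and the names dni, seria, luty, luty_jawnie and kwartal_3 are invented for this example.

```python
import numpy as np
import pandas as pd

# Synthetic business-day series for 2020; the values are arbitrary.
dni = pd.bdate_range('2020-01-01', '2020-12-31')
seria = pd.Series(np.arange(len(dni), dtype=float), index=dni)

# A month string selects every row whose timestamp falls in that month...
luty = seria.loc['2020-02']
# ...which matches slicing from the first to the last day of the month.
luty_jawnie = seria.loc['2020-02-01':'2020-02-29']

# A quarter can likewise be written out as an explicit date range.
kwartal_3 = seria.loc['2020-07-01':'2020-09-30']

print(len(luty), len(luty_jawnie), len(kwartal_3))
```

Both ends of a date slice are inclusive, which is why the last day of the month belongs in the explicit version.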
Filling in missing dates
Stock quotes and exchange rates are published only for trading days. Other kinds of data placed on a time axis can have gaps too.
Sometimes we need to fill in the missing dates...
kursy.index.min(), kursy.index.max()
(Timestamp('2020-01-02 00:00:00'), Timestamp('2022-12-30 00:00:00'))
daty = pd.date_range(kursy.index.min(), kursy.index.max())
daty
DatetimeIndex(['2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10', '2020-01-11', ... '2022-12-21', '2022-12-22', '2022-12-23', '2022-12-24', '2022-12-25', '2022-12-26', '2022-12-27', '2022-12-28', '2022-12-29', '2022-12-30'], dtype='datetime64[ns]', length=1094, freq='D')
# daty = pd.date_range('2020-01-01', '2022-12-31')
# daty
kursy2 = pd.DataFrame(kursy[['Otwarcie', 'Zamkniecie']], index=daty)
kursy2.head(12)
Otwarcie | Zamkniecie | |
---|---|---|
2020-01-02 | 92.2063 | 93.4852 |
2020-01-03 | 91.8625 | 91.7856 |
2020-01-04 | NaN | NaN |
2020-01-05 | NaN | NaN |
2020-01-06 | NaN | NaN |
2020-01-07 | 91.6709 | 91.6334 |
2020-01-08 | 91.6141 | 90.6781 |
2020-01-09 | 91.7287 | 92.5309 |
2020-01-10 | 92.6262 | 94.5358 |
2020-01-11 | NaN | NaN |
2020-01-12 | NaN | NaN |
2020-01-13 | 94.5358 | 95.4911 |
If there were no quotations on a given day, we assume that the price from the previous trading day applies.
kursy2.ffill(inplace=True)
kursy2
Otwarcie | Zamkniecie | |
---|---|---|
2020-01-02 | 92.2063 | 93.4852 |
2020-01-03 | 91.8625 | 91.7856 |
2020-01-04 | 91.8625 | 91.7856 |
2020-01-05 | 91.8625 | 91.7856 |
2020-01-06 | 91.8625 | 91.7856 |
... | ... | ... |
2022-12-26 | 122.4960 | 123.0910 |
2022-12-27 | 125.9670 | 124.3800 |
2022-12-28 | 124.2810 | 124.3310 |
2022-12-29 | 124.7770 | 127.3070 |
2022-12-30 | 126.6620 | 125.7200 |
1094 rows × 2 columns
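The same two steps, adding the missing days and then carrying the last known value forward, can also be written with reindex and ffill. The sketch below is one possible variant, not the only way to do it. It uses three closing prices copied from the table above, and the names notowania, wszystkie_dni and uzupelnione are made up for this illustration.

```python
import pandas as pd

# Closing prices for three trading days, copied from the table above.
notowania = pd.Series(
    [93.4852, 91.7856, 91.6334],
    index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-07']),
    name='Zamkniecie',
)

# Every calendar day between the first and the last quotation.
wszystkie_dni = pd.date_range(notowania.index.min(), notowania.index.max())

# reindex introduces NaN for the missing days; ffill copies the last known value forward.
uzupelnione = notowania.reindex(wszystkie_dni).ffill()
print(uzupelnione)
```

Forward filling cannot fill rows that come before the first quotation, so a date range starting earlier than the data would leave NaN at the top.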
Grouping and resample
To group the data by time periods, e.g. monthly, we can use the familiar groupby combined with extracting the month or with a formatting function.
kursy.groupby(kursy.index.month).Zamkniecie.mean()
Data 1 138.533370 2 138.760383 3 135.510346 4 136.677776 5 133.563093 6 132.542730 7 130.693433 8 138.050235 9 124.574086 10 120.587855 11 129.625712 12 143.217032 Name: Zamkniecie, dtype: float64
kursy.groupby(kursy.index.isocalendar().week).Zamkniecie.mean()
week 1 151.806580 2 146.744300 3 141.777407 4 135.650900 5 134.563907 6 138.840640 7 141.861453 8 142.344200 9 141.335107 10 139.847993 11 132.049060 12 130.483600 13 131.267479 14 132.379164 15 141.509823 16 142.365877 17 132.927967 18 134.031575 19 131.758707 20 130.768840 21 134.031440 22 135.028207 23 138.278673 24 134.751315 25 126.680220 26 127.899807 27 125.572547 28 125.210253 29 128.195000 30 134.653667 31 140.471133 32 141.988067 33 137.792429 34 134.555167 35 129.968980 36 129.128387 37 129.055953 38 123.601113 39 117.164800 40 117.968327 41 119.976987 42 122.382473 43 121.761840 44 120.137154 45 128.298846 46 130.785571 47 128.416200 48 131.744933 49 137.744600 50 145.293600 51 145.002143 52 142.129636 53 177.470667 Name: Zamkniecie, dtype: float64
kursy.groupby(kursy.index.strftime('%Y-%m')).Zamkniecie.mean()
Data 2020-01 92.464200 2020-02 86.670350 2020-03 57.940477 2020-04 66.745290 2020-05 75.215285 2020-06 84.529429 2020-07 105.116583 2020-08 129.067762 2020-09 124.183773 2020-10 116.592455 2020-11 133.815300 2020-12 172.081350 2021-01 188.127895 2021-02 185.459900 2021-03 176.876435 2021-04 188.607850 2021-05 198.728800 2021-06 184.369048 2021-07 182.219955 2021-08 178.950273 2021-09 162.297227 2021-10 154.491905 2021-11 141.902350 2021-12 136.027952 2022-01 139.791200 2022-02 144.150900 2022-03 168.341522 2022-04 155.627684 2022-05 127.069857 2022-06 128.729714 2022-07 104.726010 2022-08 105.724377 2022-09 87.241259 2022-10 90.869462 2022-11 113.159485 2022-12 122.916286 Name: Zamkniecie, dtype: float64
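Note that grouping by index.month alone pools the same month from different years, whereas the strftime('%Y-%m') key keeps them apart. A minimal sketch on synthetic data (arbitrary values; the names dni, seria, po_miesiacu and po_roku_i_miesiacu are invented here) shows the difference in the number of groups, and also an alternative key built from the year and the month.

```python
import numpy as np
import pandas as pd

# Synthetic daily data spanning two years; the values are arbitrary.
dni = pd.date_range('2020-01-01', '2021-12-31')
seria = pd.Series(np.arange(len(dni), dtype=float), index=dni)

# Grouping by month number alone merges January 2020 with January 2021.
po_miesiacu = seria.groupby(seria.index.month).mean()

# Grouping by year and month keeps each calendar month separate.
po_roku_i_miesiacu = seria.groupby([seria.index.year, seria.index.month]).mean()

print(len(po_miesiacu), len(po_roku_i_miesiacu))
```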
There is, however, a dedicated operation, resample, which reads the data at a different frequency.
kursy.resample('M').Zamkniecie.mean()
Data 2020-01-31 92.464200 2020-02-29 86.670350 2020-03-31 57.940477 2020-04-30 66.745290 2020-05-31 75.215285 2020-06-30 84.529429 2020-07-31 105.116583 2020-08-31 129.067762 2020-09-30 124.183773 2020-10-31 116.592455 2020-11-30 133.815300 2020-12-31 172.081350 2021-01-31 188.127895 2021-02-28 185.459900 2021-03-31 176.876435 2021-04-30 188.607850 2021-05-31 198.728800 2021-06-30 184.369048 2021-07-31 182.219955 2021-08-31 178.950273 2021-09-30 162.297227 2021-10-31 154.491905 2021-11-30 141.902350 2021-12-31 136.027952 2022-01-31 139.791200 2022-02-28 144.150900 2022-03-31 168.341522 2022-04-30 155.627684 2022-05-31 127.069857 2022-06-30 128.729714 2022-07-31 104.726010 2022-08-31 105.724377 2022-09-30 87.241259 2022-10-31 90.869462 2022-11-30 113.159485 2022-12-31 122.916286 Freq: M, Name: Zamkniecie, dtype: float64
kursy.resample('MS').Zamkniecie.mean()
Data 2020-01-01 92.464200 2020-02-01 86.670350 2020-03-01 57.940477 2020-04-01 66.745290 2020-05-01 75.215285 2020-06-01 84.529429 2020-07-01 105.116583 2020-08-01 129.067762 2020-09-01 124.183773 2020-10-01 116.592455 2020-11-01 133.815300 2020-12-01 172.081350 2021-01-01 188.127895 2021-02-01 185.459900 2021-03-01 176.876435 2021-04-01 188.607850 2021-05-01 198.728800 2021-06-01 184.369048 2021-07-01 182.219955 2021-08-01 178.950273 2021-09-01 162.297227 2021-10-01 154.491905 2021-11-01 141.902350 2021-12-01 136.027952 2022-01-01 139.791200 2022-02-01 144.150900 2022-03-01 168.341522 2022-04-01 155.627684 2022-05-01 127.069857 2022-06-01 128.729714 2022-07-01 104.726010 2022-08-01 105.724377 2022-09-01 87.241259 2022-10-01 90.869462 2022-11-01 113.159485 2022-12-01 122.916286 Freq: MS, Name: Zamkniecie, dtype: float64
M and MS differ in the values returned in the index (month end vs. month start), but the means are computed over whole months either way.
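As a quick cross-check, the sketch below compares a monthly resample with the groupby on strftime('%Y-%m') used earlier. It runs on a synthetic business-day series with arbitrary values; the names dni, seria, z_groupby and z_resample are assumptions of this example. Only the MS alias is used here.

```python
import numpy as np
import pandas as pd

# Synthetic business-day series covering six full months.
dni = pd.bdate_range('2020-01-01', '2020-06-30')
seria = pd.Series(np.linspace(90.0, 110.0, len(dni)), index=dni)

z_groupby = seria.groupby(seria.index.strftime('%Y-%m')).mean()
z_resample = seria.resample('MS').mean()

# Same numbers; only the index differs (text labels vs. timestamps).
print(np.allclose(z_groupby.to_numpy(), z_resample.to_numpy()))
print(type(z_groupby.index).__name__, type(z_resample.index).__name__)
```

One practical difference: resample keeps a DatetimeIndex, so the result can be sliced by date or resampled again, while the strftime key turns the index into text.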
kursy.resample('W').Zamkniecie.mean()
Data 2020-01-05 92.63540 2020-01-12 92.34455 2020-01-19 95.31062 2020-01-26 93.36750 2020-02-02 88.74172 ... 2022-12-04 115.40420 2022-12-11 120.12540 2022-12-18 124.35060 2022-12-25 124.18200 2023-01-01 125.43450 Freq: W-SUN, Name: Zamkniecie, Length: 157, dtype: float64
kursy[:'2020-01-05']
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 |
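The weekly result is labelled with the Sunday that closes each week, which is why the first bin covers only the two rows shown above. The sketch below reproduces the first two weekly means from six closing prices copied from the table; the names zamkniecie, tygodniowo and recznie are made up for this illustration.

```python
import pandas as pd

# Closing prices for the first six trading days, copied from the table above.
zamkniecie = pd.Series(
    [93.4852, 91.7856, 91.6334, 90.6781, 92.5309, 94.5358],
    index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-07',
                          '2020-01-08', '2020-01-09', '2020-01-10']),
)

tygodniowo = zamkniecie.resample('W').mean()
print(tygodniowo)

# The first bin ends on Sunday 2020-01-05 and holds only the rows up to that day.
recznie = zamkniecie.loc[:'2020-01-05'].mean()
print(recznie)
```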
As with the results of groupby, here too we can inspect the data, apply aggregate functions to selected columns, or use agg in general.
# for exchange rates or stock indices, leave out the volume
kursy.resample('Q').agg({
'Otwarcie': ['first', 'min', 'mean', 'max'],
'Zamkniecie': ['last', 'min', 'mean', 'max'],
'Najnizszy': ['min'],
'Najwyzszy': ['max'],
'Wolumen': ['sum']
})
Data | Otwarcie first | Otwarcie min | Otwarcie mean | Otwarcie max | Zamkniecie last | Zamkniecie min | Zamkniecie mean | Zamkniecie max | Najnizszy min | Najwyzszy max | Wolumen sum |
---|---|---|---|---|---|---|---|---|---|---|---|
2020-03-31 | 92.2063 | 49.6551 | 79.096583 | 96.5869 | 57.1231 | 47.1725 | 78.568979 | 96.4425 | 45.8456 | 96.9721 | 5.349714e+07 |
2020-06-30 | 56.3392 | 56.2439 | 75.431884 | 87.3934 | 86.8011 | 56.0724 | 75.644746 | 87.3934 | 54.6395 | 88.1378 | 4.655560e+07 |
2020-09-30 | 86.4766 | 86.4766 | 118.445233 | 132.3520 | 112.6780 | 88.7685 | 119.093142 | 133.0640 | 86.3620 | 133.6430 | 4.961910e+07 |
2020-12-31 | 113.1590 | 108.5760 | 138.983710 | 180.4820 | 174.7520 | 109.8180 | 140.047855 | 180.8090 | 106.9480 | 183.2940 | 4.052619e+07 |
2021-03-31 | 177.5160 | 161.3280 | 183.477000 | 207.1180 | 181.6280 | 159.1800 | 183.093323 | 206.1640 | 158.5170 | 211.0370 | 3.835844e+07 |
2021-06-30 | 181.4350 | 170.4480 | 190.979902 | 217.8170 | 180.7030 | 170.8810 | 190.466934 | 213.7040 | 168.1370 | 220.3970 | 3.940795e+07 |
2021-09-30 | 181.0400 | 149.2620 | 175.253015 | 190.6700 | 151.8140 | 148.5390 | 174.489152 | 193.0770 | 147.1920 | 193.0770 | 3.166962e+07 |
2021-12-31 | 150.2250 | 128.2690 | 144.787226 | 167.0780 | 134.2400 | 128.1720 | 144.176839 | 164.1400 | 126.6320 | 168.0880 | 4.283083e+07 |
2022-03-31 | 135.5390 | 132.0250 | 151.446968 | 175.2620 | 166.4990 | 130.0510 | 151.598365 | 175.4550 | 128.7510 | 186.4330 | 5.125558e+07 |
2022-06-30 | 166.5950 | 110.7420 | 137.585377 | 169.3400 | 114.4980 | 110.0690 | 136.536344 | 169.3400 | 108.1430 | 170.3520 | 4.648571e+07 |
2022-09-30 | 112.9570 | 81.8293 | 99.687940 | 114.5110 | 86.9473 | 83.5353 | 99.146003 | 114.9580 | 80.0241 | 114.9580 | 4.634211e+07 |
2022-12-31 | 85.8959 | 85.8959 | 108.413421 | 127.8020 | 125.7200 | 86.8283 | 108.914361 | 127.6540 | 85.2016 | 128.6950 | 4.285551e+07 |
kursy.resample('W').agg({
'Otwarcie': ['first', 'min', 'mean', 'max'],
'Zamkniecie': ['last', 'min', 'mean', 'max'],
'Najnizszy': ['min'],
'Najwyzszy': ['max'],
'Wolumen': ['sum']
}).to_excel('resample.xlsx')
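Aggregating with a dictionary of lists produces two-level column headers. If a single header row is preferred before exporting, the levels can be joined. The sketch below shows one way to do that on synthetic data with arbitrary values; the names dni, dane and tygodnie, and the underscore naming scheme, are assumptions of this example.

```python
import numpy as np
import pandas as pd

# Synthetic quotations for the business days of one month.
dni = pd.bdate_range('2020-01-01', '2020-01-31')
dane = pd.DataFrame({
    'Otwarcie': np.linspace(90.0, 95.0, len(dni)),
    'Zamkniecie': np.linspace(91.0, 96.0, len(dni)),
    'Wolumen': np.full(len(dni), 1000.0),
}, index=dni)

tygodnie = dane.resample('W').agg({
    'Otwarcie': ['first'],
    'Zamkniecie': ['last', 'mean'],
    'Wolumen': ['sum'],
})

# Join the two header levels into one, e.g. ('Zamkniecie', 'last') -> 'Zamkniecie_last'.
tygodnie.columns = ['_'.join(para) for para in tygodnie.columns]
print(tygodnie.columns.tolist())
```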
Differences between adjacent values
The basis of this kind of analysis is the shift operation, which moves the data relative to the index.
kursy.head(5)
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 |
2020-01-07 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 329438.653029 |
2020-01-08 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 263026.093522 |
2020-01-09 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 545410.834420 |
kursy.shift(2).head(5)
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | NaN | NaN | NaN | NaN | NaN |
2020-01-03 | NaN | NaN | NaN | NaN | NaN |
2020-01-07 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 |
2020-01-08 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 |
2020-01-09 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 329438.653029 |
kursy.shift(-3).head(5)
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | |
---|---|---|---|---|---|
Data | |||||
2020-01-02 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 263026.093522 |
2020-01-03 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 545410.834420 |
2020-01-07 | 92.6262 | 94.8228 | 92.6262 | 94.5358 | 724896.295811 |
2020-01-08 | 94.5358 | 95.6817 | 94.5358 | 95.4911 | 649896.854255 |
2020-01-09 | 95.5864 | 96.9239 | 95.0134 | 95.2995 | 485275.660568 |
The default value in shift is 1. Using shift, we can perform any calculation that involves adjacent values in a series.
Example: the difference between consecutive prices (as an amount).
kursy.Zamkniecie - kursy.Zamkniecie.shift()
Data 2020-01-02 NaN 2020-01-03 -1.6996 2020-01-07 -0.1522 2020-01-08 -0.9553 2020-01-09 1.8528 ... 2022-12-23 0.2980 2022-12-27 1.2890 2022-12-28 -0.0490 2022-12-29 2.9760 2022-12-30 -1.5870 Name: Zamkniecie, Length: 754, dtype: float64
For the most typical needs, however, there are ready-made methods.
diff - difference between values
pct_change - percentage change
kursy.Zamkniecie.diff()
Data 2020-01-02 NaN 2020-01-03 -1.6996 2020-01-07 -0.1522 2020-01-08 -0.9553 2020-01-09 1.8528 ... 2022-12-23 0.2980 2022-12-27 1.2890 2022-12-28 -0.0490 2022-12-29 2.9760 2022-12-30 -1.5870 Name: Zamkniecie, Length: 754, dtype: float64
kursy.Zamkniecie.pct_change()
Data 2020-01-02 NaN 2020-01-03 -0.018180 2020-01-07 -0.001658 2020-01-08 -0.010425 2020-01-09 0.020433 ... 2022-12-23 0.002427 2022-12-27 0.010472 2022-12-28 -0.000394 2022-12-29 0.023936 2022-12-30 -0.012466 Name: Zamkniecie, Length: 754, dtype: float64
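Both methods can be expressed with shift, which makes it clear what they compute. The sketch below checks this on five closing prices copied from the table above; the names zamkniecie, poprzednie, roznica and zmiana are made up for this illustration.

```python
import numpy as np
import pandas as pd

# The first five closing prices from the table above.
zamkniecie = pd.Series([93.4852, 91.7856, 91.6334, 90.6781, 92.5309])

poprzednie = zamkniecie.shift()         # previous value, NaN for the first row

roznica = zamkniecie - poprzednie       # what diff() computes
zmiana = zamkniecie / poprzednie - 1    # what pct_change() computes

print(np.allclose(roznica, zamkniecie.diff(), equal_nan=True))
print(np.allclose(zmiana, zamkniecie.pct_change(), equal_nan=True))
```

Note that pct_change returns a fraction rather than a percentage: the value -0.018180 in the output above corresponds to a drop of about 1.8%.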
kursy['Zmiana_kwotowa'] = kursy.Zamkniecie.diff()
kursy['Zmiana_procentowa'] = kursy.Zamkniecie.pct_change()
kursy
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | Zmiana_kwotowa | Zmiana_procentowa | |
---|---|---|---|---|---|---|---|
Data | |||||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 | NaN | NaN |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 | -1.6996 | -0.018180 |
2020-01-07 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 329438.653029 | -0.1522 | -0.001658 |
2020-01-08 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 263026.093522 | -0.9553 | -0.010425 |
2020-01-09 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 545410.834420 | 1.8528 | 0.020433 |
... | ... | ... | ... | ... | ... | ... | ... |
2022-12-23 | 122.4960 | 124.2320 | 121.4050 | 123.0910 | 322849.795078 | 0.2980 | 0.002427 |
2022-12-27 | 125.9670 | 126.3150 | 124.0330 | 124.3800 | 319922.999996 | 1.2890 | 0.010472 |
2022-12-28 | 124.2810 | 126.1160 | 123.6860 | 124.3310 | 346183.499996 | -0.0490 | -0.000394 |
2022-12-29 | 124.7770 | 128.3980 | 123.4880 | 127.3070 | 418031.631143 | 2.9760 | 0.023936 |
2022-12-30 | 126.6620 | 127.4060 | 125.3230 | 125.7200 | 270589.918030 | -1.5870 | -0.012466 |
754 rows × 7 columns
Cumulative and "window" ("rolling") calculations.
kursy.Wolumen.head(10)
Data 2020-01-02 243201.185407 2020-01-03 332923.666472 2020-01-07 329438.653029 2020-01-08 263026.093522 2020-01-09 545410.834420 2020-01-10 724896.295811 2020-01-13 649896.854255 2020-01-14 485275.660568 2020-01-15 756478.451303 2020-01-16 416346.954585 Name: Wolumen, dtype: float64
kursy.Wolumen.head(10).cumsum()
Data 2020-01-02 2.432012e+05 2020-01-03 5.761249e+05 2020-01-07 9.055635e+05 2020-01-08 1.168590e+06 2020-01-09 1.714000e+06 2020-01-10 2.438897e+06 2020-01-13 3.088794e+06 2020-01-14 3.574069e+06 2020-01-15 4.330548e+06 2020-01-16 4.746895e+06 Name: Wolumen, dtype: float64
With rolling, we can select N adjacent values and compute aggregate functions, e.g. the mean, over those values.
kursy.Zamkniecie.rolling(window=5).mean().head(10)
Data 2020-01-02 NaN 2020-01-03 NaN 2020-01-07 NaN 2020-01-08 NaN 2020-01-09 92.02264 2020-01-10 92.23276 2020-01-13 92.97386 2020-01-14 93.70708 2020-01-15 94.43278 2020-01-16 94.92928 Name: Zamkniecie, dtype: float64
kursy['Średnia 5'] = kursy.Zamkniecie.rolling(window=5).mean()
kursy.head(15)
Otwarcie | Najwyzszy | Najnizszy | Zamkniecie | Wolumen | Zmiana_kwotowa | Zmiana_procentowa | Średnia 5 | |
---|---|---|---|---|---|---|---|---|
Data | ||||||||
2020-01-02 | 92.2063 | 93.7722 | 92.2063 | 93.4852 | 243201.185407 | NaN | NaN | NaN |
2020-01-03 | 91.8625 | 92.6262 | 90.9458 | 91.7856 | 332923.666472 | -1.6996 | -0.018180 | NaN |
2020-01-07 | 91.6709 | 92.8169 | 90.7166 | 91.6334 | 329438.653029 | -0.1522 | -0.001658 | NaN |
2020-01-08 | 91.6141 | 92.1100 | 90.6781 | 90.6781 | 263026.093522 | -0.9553 | -0.010425 | NaN |
2020-01-09 | 91.7287 | 92.9700 | 91.7287 | 92.5309 | 545410.834420 | 1.8528 | 0.020433 | 92.02264 |
2020-01-10 | 92.6262 | 94.8228 | 92.6262 | 94.5358 | 724896.295811 | 2.0049 | 0.021667 | 92.23276 |
2020-01-13 | 94.5358 | 95.6817 | 94.5358 | 95.4911 | 649896.854255 | 0.9553 | 0.010105 | 92.97386 |
2020-01-14 | 95.5864 | 96.9239 | 95.0134 | 95.2995 | 485275.660568 | -0.1916 | -0.002006 | 93.70708 |
2020-01-15 | 95.0134 | 95.0134 | 93.1231 | 94.3066 | 756478.451303 | -0.9929 | -0.010419 | 94.43278 |
2020-01-16 | 94.3451 | 95.4718 | 94.1920 | 95.0134 | 416346.954585 | 0.7068 | 0.007495 | 94.92928 |
2020-01-17 | 95.4911 | 96.4425 | 95.2426 | 96.4425 | 337083.667619 | 1.4291 | 0.015041 | 95.31062 |
2020-01-20 | 96.5869 | 96.9721 | 95.5383 | 95.8743 | 230847.871965 | -0.5682 | -0.005892 | 95.38726 |
2020-01-21 | 95.1665 | 95.1665 | 92.8169 | 93.3523 | 752614.406303 | -2.5220 | -0.026305 | 94.99782 |
2020-01-22 | 93.1992 | 93.9436 | 92.2631 | 92.6262 | 530442.722453 | -0.7261 | -0.007778 | 94.66174 |
2020-01-23 | 91.9579 | 93.7529 | 91.7662 | 92.0147 | 694458.504090 | -0.6115 | -0.006602 | 94.06200 |
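The first four rows are NaN because a full window of five values is not yet available. The sketch below verifies the first computed mean by hand and shows the min_periods parameter, which allows shorter windows at the start. The six prices are copied from the table above; the names zamkniecie, srednia_5, srednia_5_od_poczatku and recznie are made up for this illustration.

```python
import pandas as pd

# The first six closing prices from the table above.
zamkniecie = pd.Series([93.4852, 91.7856, 91.6334, 90.6781, 92.5309, 94.5358])

# Default behaviour: NaN until five values are available.
srednia_5 = zamkniecie.rolling(window=5).mean()

# min_periods=1 averages whatever is available, so there are no leading NaN values.
srednia_5_od_poczatku = zamkniecie.rolling(window=5, min_periods=1).mean()

# The fifth row is simply the mean of rows 0..4.
recznie = zamkniecie.iloc[0:5].mean()
print(srednia_5.iloc[4], recznie)
```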
The center option places the mean under the index at the middle of the window (rather than at its end).
kursy.Zamkniecie.rolling(window=5, center=True).mean().head(10)
Data 2020-01-02 NaN 2020-01-03 NaN 2020-01-07 92.02264 2020-01-08 92.23276 2020-01-09 92.97386 2020-01-10 93.70708 2020-01-13 94.43278 2020-01-14 94.92928 2020-01-15 95.31062 2020-01-16 95.38726 Name: Zamkniecie, dtype: float64
kursy.Zamkniecie.rolling(window=5, center=True, closed='left').mean().head(10)
Data 2020-01-02 NaN 2020-01-03 NaN 2020-01-07 NaN 2020-01-08 92.02264 2020-01-09 92.23276 2020-01-10 92.97386 2020-01-13 93.70708 2020-01-14 94.43278 2020-01-15 94.92928 2020-01-16 95.31062 Name: Zamkniecie, dtype: float64
kursy.Zamkniecie.rolling(window=5, center=True, closed='right').mean().head(10)
Data 2020-01-02 NaN 2020-01-03 NaN 2020-01-07 92.02264 2020-01-08 92.23276 2020-01-09 92.97386 2020-01-10 93.70708 2020-01-13 94.43278 2020-01-14 94.92928 2020-01-15 95.31062 2020-01-16 95.38726 Name: Zamkniecie, dtype: float64
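For a window of 5, centring amounts to moving the ordinary (trailing) result two rows earlier. A minimal sketch on a synthetic series confirms this; the values are arbitrary and the names seria, koncowa and srodkowa are invented for this example.

```python
import numpy as np
import pandas as pd

# Arbitrary synthetic values: 0, 1, 4, 9, ...
seria = pd.Series(np.arange(10, dtype=float) ** 2)

koncowa = seria.rolling(window=5).mean()                 # mean placed at the end of the window
srodkowa = seria.rolling(window=5, center=True).mean()   # mean placed in the middle

# For window=5, the centred result is the trailing result moved 2 rows earlier.
print(np.allclose(srodkowa, koncowa.shift(-2), equal_nan=True))
```

A centred window uses values that come after the labelled row, so it is suited to smoothing historical data rather than to signals that must be known on the day itself.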
Bollinger Bands
Based on the moving average and the standard deviation, we can define a "corridor".
We will check whether the actual price stays within this corridor.
WINDOW = 10
KORYTARZ = 2
srednia_kroczaca = kursy.Zamkniecie.rolling(window=WINDOW).mean()
odchylenie = kursy.Zamkniecie.rolling(window=WINDOW).std()
wstega_gorna = srednia_kroczaca + KORYTARZ * odchylenie
wstega_dolna = srednia_kroczaca - KORYTARZ * odchylenie
dfb = pd.DataFrame({
'Bieżące': kursy.Zamkniecie,
'Trend': srednia_kroczaca,
'Górna': wstega_gorna,
'Dolna': wstega_dolna,
})
dfb.iloc[8:28]
Bieżące | Trend | Górna | Dolna | |
---|---|---|---|---|
Data | ||||
2020-01-15 | 94.3066 | NaN | NaN | NaN |
2020-01-16 | 95.0134 | 93.47596 | 96.910207 | 90.041713 |
2020-01-17 | 96.4425 | 93.77169 | 97.685332 | 89.858048 |
2020-01-20 | 95.8743 | 94.18056 | 98.025730 | 90.335390 |
2020-01-21 | 93.3523 | 94.35245 | 97.827410 | 90.877490 |
2020-01-22 | 92.6262 | 94.54726 | 97.236259 | 91.858261 |
2020-01-23 | 92.0147 | 94.49564 | 97.370096 | 91.621184 |
2020-01-24 | 92.9700 | 94.33906 | 97.370115 | 91.308005 |
2020-01-27 | 88.9591 | 93.68586 | 98.109106 | 89.262614 |
2020-01-28 | 89.4753 | 93.10344 | 98.081362 | 88.125518 |
2020-01-29 | 89.8567 | 92.65845 | 97.944402 | 87.372498 |
2020-01-30 | 88.3294 | 91.99005 | 97.630980 | 86.349120 |
2020-01-31 | 87.0881 | 91.05461 | 96.513528 | 85.595692 |
2020-02-03 | 88.4247 | 90.30965 | 94.791062 | 85.828238 |
2020-02-04 | 92.0917 | 90.18359 | 94.344036 | 86.023144 |
2020-02-05 | 93.6575 | 90.28672 | 94.755938 | 85.817502 |
2020-02-06 | 93.2945 | 90.41470 | 95.168101 | 85.661299 |
2020-02-07 | 90.4682 | 90.16452 | 94.570866 | 85.758174 |
2020-02-10 | 88.0039 | 90.06900 | 94.630176 | 85.507824 |
2020-02-11 | 89.9520 | 90.11667 | 94.660199 | 85.573141 |
dfb.plot()
<Axes: xlabel='Data'>
wykres = dfb.plot(figsize=(20, 10), grid=True)
wykres.fill_between(dfb.index, dfb["Górna"], dfb["Dolna"], color="#FFFFCC", alpha=0.25)
wykres
<Axes: xlabel='Data'>
wykres.get_figure().savefig('wstega.png')
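The check itself, whether a price falls outside the corridor, can be expressed as a boolean column. The sketch below wraps the calculation from above in a helper and flags such rows. The function name wstega_bollingera, its parameters okno and szerokosc, and the Kurs and Poza columns are inventions of this example, and the input is a synthetic series rather than real quotations.

```python
import pandas as pd

def wstega_bollingera(kurs: pd.Series, okno: int = 10, szerokosc: float = 2.0) -> pd.DataFrame:
    """Return the rolling mean, both bands and a flag for values outside the bands."""
    srednia = kurs.rolling(window=okno).mean()
    odchylenie = kurs.rolling(window=okno).std()
    gorna = srednia + szerokosc * odchylenie
    dolna = srednia - szerokosc * odchylenie
    # Comparisons with NaN are False, so the warm-up rows are never flagged.
    poza = (kurs > gorna) | (kurs < dolna)
    return pd.DataFrame({'Kurs': kurs, 'Trend': srednia,
                         'Górna': gorna, 'Dolna': dolna, 'Poza': poza})

# Synthetic series: calm alternation between 100 and 101, then a jump to 120.
wartosci = [100.0, 101.0] * 15
wartosci[-1] = 120.0
wynik = wstega_bollingera(pd.Series(wartosci))

# Show only the rows where the value left the corridor.
print(wynik[wynik['Poza']])
```

Because the current value is part of its own window, a jump also widens the band it is compared against. Comparing against the previous row's bands (for example with shift) would be a stricter variant.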