sons dataStory

글

11월, 2024의 게시물 표시

Statistical analysis results graph code

The following graphs are the codes for the figures included in Chapters 6 and 7 of the e-book Statistics with Python . import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import FinanceDataReader as fdr import yfinance as yf import statsmodels.api as sm from statsmodels.formula.api import ols from sklearn.linear_model import LinearRegression import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig 611 st=pd.Timestamp(2024,1, 1) et=pd.Timestamp(2024, 5, 30) code=["^KS11", "^KQ11", "^DJI", "KRW=X"] nme=['kos','kq','dj','WonDol'] da=pd.DataFrame() for i in code: x=yf.download(i,st, et)['Close'] x1=x.pct_change() da=pd.concat([da, x1], axis=1) da.columns=nme da1=da.dropna() da1.index=range(len(da1)) da2=pd.melt(da1, value_vars=['kos', 'kq', 'dj', 'WonDol'], var_name=...

Graph code related to statistical tests

The following graphs are the codes for the figures included in Chapter 5 of the e-book Statistics with Python . import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import yfinance as yf import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig511 st=pd.Timestamp(2024, 4,20) et=pd.Timestamp(2024, 5, 30) da1=yf.download('GOOGL', st, et)["Close"] da2=yf.download('MSFT', st, et)["Close"] da1=da1.iloc[:,0].pct_change()[1:]*100 da2=da2.iloc[:,0].pct_change()[1:]*100 da=pd.DataFrame([da1, da2], index=['data1', 'data2']).T da.index=range(len(da1)) mu1, sd1, n1=np.mean(da1), np.std(da1, ddof=1), len(da1) mu2, sd2, n2=np.mean(da2), np.std(da2, ddof=1), len(da2) s_p=np.sqrt(((n1-1)*sd1**2+(n2-1)*sd2**2)/(n1+n2-2)) se=s_p*np.sqrt((1/n1+1/n2)) df=n1+n2-2 mu=mu1-mu2 testStatic=((mu1-mu2)-0)/se x=np.linspace(-3, 3, 500) p=stats.t.pdf(x, df) l=stats.t...

Statistics related graph code

The following graphs are the codes for the figures included in Chapter 4 of the e-book Statistics with Python . import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import yfinance as yf import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") # fig 421 st=pd.Timestamp(2022,12, 1) et=pd.Timestamp(2023, 4, 1) da=yf.download("^DJI", st, et) pop=(da['Close']-da['Open'])/da['Open']*100 pop.columns=['change'] pop.index=pd.DatetimeIndex(pop.index.date) xBar=np.array([]) for i in range(20): x=pop.sample(5, replace=False, random_state=i) xBar=np.append(xBar, x.mean()) xBar2=np.array([]) for i in range(100): x=pop.sample(5, replace=False, random_state=i) xBar2=np.append(xBar2, x.mean()) plt.figure(figsize=(7, 3)) plt.subplot(1,2,1) plt.hist(xBar, 10, rwidth=0.8, color="blue", label="n=20") plt.axvline(-0.18, 0, 0.9, color=...

통계분석 결과 그래프

다음 그래프들은 전자책 파이썬과 함께하는 통계이야기 6 장과 7장에 수록된 그림들의 코드들입니다. import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import FinanceDataReader as fdr import yfinance as yf import statsmodels.api as sm from statsmodels.formula.api import ols from sklearn.linear_model import LinearRegression import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig 611 st=pd.Timestamp(2024,1, 1) et=pd.Timestamp(2024, 5, 30) code=["^KS11", "^KQ11", "^DJI", "KRW=X"] nme=['kos','kq','dj','WonDol'] da=pd.DataFrame() for i in code: x=yf.download(i,st, et)['Close'] x1=x.pct_change() da=pd.concat([da, x1], axis=1) da.columns=nme da1=da.dropna() da1.index=range(len(da1)) da2=pd.melt(da1, value_vars=['kos', 'kq', 'dj', 'WonDol'], var_name="idx", value_name="val") model=ols("val~...

통계 검정에 관련된 그래프

다음 그래프들은 전자책 파이썬과 함께하는 통계이야기 5 장에 수록된 그림들의 코드들입니다. import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import FinanceDataReader as fdr import yfinance as yf import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig 511 st=pd.Timestamp(2024, 4,20) et=pd.Timestamp(2024, 5, 30) da1=fdr.DataReader('091160', st, et)["Close"] da2=fdr.DataReader('005930', st, et)["Close"] da1=da1.pct_change()[1:]*100 da2=da2.pct_change()[1:]*100 da=pd.DataFrame([da1, da2], index=['data1', 'data2']).T da.index=range(len(da1)) mu1, sd1, n1=np.mean(da1), np.std(da1, ddof=1), len(da1) mu2, sd2, n2=np.mean(da2), np.std(da2, ddof=1), len(da2) s_p=np.sqrt(((n1-1)*sd1**2+(n2-1)*sd2**2)/(n1+n2-2)) se=s_p*np.sqrt((1/n1+1/n2)) se=s_p*np.sqrt((1/n1+1/n2)) df=n1+n2-2 mu=mu1-mu2 ci=stats.t.interval(0.95, df, mu, se) testStatic=((mu1-mu2)-0)/se x=np.linspace(-3, 3, 500) ...

통계관련 그래프

다음 그래프들은 전자책 파이썬과 함께하는 통계이야기 4 장에 수록된 그림들의 코드들입니다. import numpy as np import pandas as pd from scipy import stats from sklearn.preprocessing import StandardScaler import FinanceDataReader as fdr import yfinance as yf import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig 421 st=pd.Timestamp(2022,12, 1) et=pd.Timestamp(2023, 4, 1) da=fdr.DataReader("KS11", st, et) pop=(da['Close']-da['Open'])/da['Open']*100 np.random.seed(1) xBar=np.array([]) for i in range(20): x=pop.sample(5, random_state=i) xBar=np.append(xBar, x.mean()) xBar2=np.array([]) for i in range(100): x=pop.sample(5, random_state=i) xBar2=np.append(xBar2, x.mean()) fig, ax=plt.subplots(1, 2, figsize=(8, 3)) ax[0].hist(xBar, 10, rwidth=0.8, color="blue", label="n=20") ax[0].axvline(xBar.mean(), 0, 0.9, color="red", label=f"m={round(xBar.mean(), 2)}") ax[0].set_xlabel("x", weight=...

확률, 통계 관련 그래프

다음 그래프들은 전자책 파이썬과 함께하는 통계이야기 3 장에 수록된 그림들의 코드들입니다. import numpy as np import pandas as pd from scipy import stats import matplotlib.pyplot as plt import seaborn as sns sns.set_style("darkgrid") #fig 311 uni, p=[0, 1, 2],[0.25, 0.5, 0.25] fig, ax=plt.subplots(figsize=(4,3)) ax.bar(uni, p, color="g", alpha=0.3) ax.set_xlabel("# of head(H)") ax.set_ylabel("PMF") ax.set_yticks(p) ax.set_xticks(uni) plt.show() #fig 312 import itertools ca=list(itertools.product(range(1, 7), repeat=2)) S=[sum(i) for i in ca] uni, fre=np.unique(S, return_counts=True) fresum=sum(fre) p=[i/fresum for i in fre] fig, ax=plt.subplots(figsize=(4,3)) ax.bar(uni, p, color="g", alpha=0.3) ax.set_xlabel("Sum of number") ax.set_ylabel("PMF") ax.set_yticks(np.array(p).round(3)) ax.set_xticks(uni) plt.show() #fig 313 ca=list(itertools.product([0, 1], repeat=2)) S=[sum(i) for i in ca] uni, fre=np.unique(S, return_counts=True) re=pd.DataFrame([...