Caso de Estudio: Graficación de Precios de Bolsa con ChatGPT#

  • Ultima modificación: Feb 04, 2024 | YouTube

La url ‘https://raw.githubusercontent.com/jdvelasq/datalabs/master/datasets/precio_bolsa_nacional/csv/*.csv’ contiene los archivos de los precios horarios de bolsa para el mercado electrico colombiano. Construya una gráfica que contenga los precios promedios diarios y los precios promedios mensuales.

assets/repo_precios_bolsa.png

Paso 1#

Prompt: I want you to act as a python programmer assistant.

Write the code to obtain the filenames in a github repo with owner=”jdvelasq”, repo=”datalabs”, and folder=”datasets/precio_bolsa_nacional/csv/”.

Do not explain the code.

assets/chatgpt-01.png

[1]:
import requests

owner = "jdvelasq"
repo = "datalabs"
folder = "datasets/precio_bolsa_nacional/csv/"

url = f"https://api.github.com/repos/{owner}/{repo}/contents/{folder}"
response = requests.get(url)

filenames = [file["name"] for file in response.json()]
print(filenames)
['199507.csv', '199508.csv', '199509.csv', '199510.csv', '199511.csv', '199512.csv', '199601.csv', '199602.csv', '199603.csv', '199604.csv', '199605.csv', '199606.csv', '199607.csv', '199608.csv', '199609.csv', '199610.csv', '199611.csv', '199612.csv', '199701.csv', '199702.csv', '199703.csv', '199704.csv', '199705.csv', '199706.csv', '199707.csv', '199708.csv', '199709.csv', '199710.csv', '199711.csv', '199712.csv', '199801.csv', '199802.csv', '199803.csv', '199804.csv', '199805.csv', '199806.csv', '199807.csv', '199808.csv', '199809.csv', '199810.csv', '199811.csv', '199812.csv', '199901.csv', '199902.csv', '199903.csv', '199904.csv', '199905.csv', '199906.csv', '199907.csv', '199908.csv', '199909.csv', '199910.csv', '199911.csv', '199912.csv', '200001.csv', '200002.csv', '200003.csv', '200004.csv', '200005.csv', '200006.csv', '200007.csv', '200008.csv', '200009.csv', '200010.csv', '200011.csv', '200012.csv', '200101.csv', '200102.csv', '200103.csv', '200104.csv', '200105.csv', '200106.csv', '200107.csv', '200108.csv', '200109.csv', '200110.csv', '200111.csv', '200112.csv', '200201.csv', '200202.csv', '200203.csv', '200204.csv', '200205.csv', '200206.csv', '200207.csv', '200208.csv', '200209.csv', '200210.csv', '200211.csv', '200212.csv', '200301.csv', '200302.csv', '200303.csv', '200304.csv', '200305.csv', '200306.csv', '200307.csv', '200308.csv', '200309.csv', '200310.csv', '200311.csv', '200312.csv', '200401.csv', '200402.csv', '200403.csv', '200404.csv', '200405.csv', '200406.csv', '200407.csv', '200408.csv', '200409.csv', '200410.csv', '200411.csv', '200412.csv', '200501.csv', '200502.csv', '200503.csv', '200504.csv', '200505.csv', '200506.csv', '200507.csv', '200508.csv', '200509.csv', '200510.csv', '200511.csv', '200512.csv', '200601.csv', '200602.csv', '200603.csv', '200604.csv', '200605.csv', '200606.csv', '200607.csv', '200608.csv', '200609.csv', '200610.csv', '200611.csv', '200612.csv', '200701.csv', '200702.csv', '200703.csv', '200704.csv', '200705.csv', '200706.csv', '200707.csv', '200708.csv', '200709.csv', '200710.csv', '200711.csv', '200712.csv', '200801.csv', '200802.csv', '200803.csv', '200804.csv', '200805.csv', '200806.csv', '200807.csv', '200808.csv', '200809.csv', '200810.csv', '200811.csv', '200812.csv', '200901.csv', '200902.csv', '200903.csv', '200904.csv', '200905.csv', '200906.csv', '200907.csv', '200908.csv', '200909.csv', '200910.csv', '200911.csv', '200912.csv', '201001.csv', '201002.csv', '201003.csv', '201004.csv', '201005.csv', '201006.csv', '201007.csv', '201008.csv', '201009.csv', '201010.csv', '201011.csv', '201012.csv', '201101.csv', '201102.csv', '201103.csv', '201104.csv', '201105.csv', '201106.csv', '201107.csv', '201108.csv', '201109.csv', '201110.csv', '201111.csv', '201112.csv', '201201.csv', '201202.csv', '201203.csv', '201204.csv', '201205.csv', '201206.csv', '201207.csv', '201208.csv', '201209.csv', '201210.csv', '201211.csv', '201212.csv', '201301.csv', '201302.csv', '201303.csv', '201304.csv', '201305.csv', '201306.csv', '201307.csv', '201308.csv', '201309.csv', '201310.csv', '201311.csv', '201312.csv', '201401.csv', '201402.csv', '201403.csv', '201404.csv', '201405.csv', '201406.csv', '201407.csv', '201408.csv', '201409.csv', '201410.csv', '201411.csv', '201412.csv', '201501.csv', '201502.csv', '201503.csv', '201504.csv', '201505.csv', '201506.csv', '201507.csv', '201508.csv', '201509.csv', '201510.csv', '201511.csv', '201512.csv', '201601.csv', '201602.csv', '201603.csv', '201604.csv', '201605.csv', '201606.csv', '201607.csv', '201608.csv', '201609.csv', '201610.csv', '201611.csv', '201612.csv', '201701.csv', '201702.csv', '201703.csv', '201704.csv', '201705.csv', '201706.csv', '201707.csv', '201708.csv', '201709.csv', '201710.csv', '201711.csv', '201712.csv', '201801.csv', '201802.csv', '201803.csv', '201804.csv', '201805.csv', '201806.csv', '201807.csv', '201808.csv', '201809.csv', '201810.csv', '201811.csv', '201812.csv', '201901.csv', '201902.csv', '201903.csv', '201904.csv', '201905.csv', '201906.csv', '201907.csv', '201908.csv', '201909.csv', '201910.csv', '201911.csv', '201912.csv', '202001.csv', '202002.csv', '202003.csv', '202004.csv', '202005.csv', '202006.csv', '202007.csv', '202008.csv', '202009.csv', '202010.csv', '202011.csv', '202012.csv', '202101.csv', '202102.csv', '202103.csv', '202104.csv']

Paso 2#

Prompt: I want you to act as a python programmer assistant.

Write the code to read with pandas the CSV filenames in the master branch of the github repo with owner=”jdvelasq”, repo=”datalabs”, and folder=”datasets/precio_bolsa_nacional/csv”. Next, concatenate the dataframes.

Do not explain the code.

assets/chatgpt-02.png

[2]:
import pandas as pd

owner = "jdvelasq"
repo = "datalabs"
folder = "datasets/precio_bolsa_nacional/csv/"

url = f"https://api.github.com/repos/{owner}/{repo}/contents/{folder}"
response = requests.get(url)

base_url = f"https://raw.githubusercontent.com/{owner}/{repo}/master/{folder}"

dfs = []
for file in response.json():
    file_url = base_url + file["name"]
    df = pd.read_csv(file_url)
    dfs.append(df)

df_concat = pd.concat(dfs, ignore_index=True)
print(df_concat.head())
        Fecha      0      1      2     3     4      5      6      7      8
0  1995-07-21  1.073    1.0    1.0   1.0   1.0    1.0    5.0    6.0    6.0  \
1  1995-07-22  1.073  1.073    1.0   1.0   1.0  1.073  1.303  1.303  1.303
2  1995-07-23  1.073    1.0    1.0   1.0   1.0    1.0    0.1    1.0    1.0
3  1995-07-24    1.0    1.0   0.99   1.0   1.0  1.073    3.0    3.0    3.0
4  1995-07-25   0.99   0.99  0.989  0.99  0.99  1.073  1.263  1.263  1.263

   ...     14     15     16     17     18     19      20     21     22     23
0  ...    5.0    1.0    1.0    5.0   12.0  16.67  11.929    5.0    1.0    1.0
1  ...  1.073    1.0    1.0    1.0  1.303    2.5     2.5  1.303  1.073  1.073
2  ...    1.0    0.1    0.1    1.0  1.238  1.238   1.238  1.238  1.073    1.0
3  ...  1.073  1.073    3.0    2.0  18.63   22.5   9.256    3.0  1.073    1.0
4  ...  1.073  1.073  1.073  1.073  1.263    1.5   1.263  1.263  1.073   0.99

[5 rows x 25 columns]

Paso 3#

Prompt: I want you to act as a python programmer assistant.

The previous dataframe called “df_concat” contains the column “Fecha” represent a date with format “YYYY-MM-DD”, and the columns “0” to “23” representing the hours of a day. Write a code to generate a dataframe with columns “Fecha”, hour, and value.

Do not explain the code.

assets/chatgpt-03.png

[3]:
import pandas as pd

melted = pd.melt(df_concat, id_vars=["Fecha"], var_name="hour", value_name="value")
melted["hour"] = melted["hour"].astype(int)
print(melted.head())
        Fecha  hour  value
0  1995-07-21     0  1.073
1  1995-07-22     0  1.073
2  1995-07-23     0  1.073
3  1995-07-24     0    1.0
4  1995-07-25     0   0.99

Paso 4#

Prompt: I want you to act as a python programmer assistant.

Write a code that transform the elements in column “value” of the previous dataframe, when they are of type str. Take into account that the values of type str use “.” to represent thousands and “,” to represent decimals.

Do not explain the code.

assets/chatgpt-04.png

[4]:
def str_to_float(x):
    if isinstance(x, str):
        x = x.replace(".", "").replace(",", ".")
        x = float(x)
    return x


melted["value"] = melted["value"].apply(str_to_float)
print(melted.head())
        Fecha  hour  value
0  1995-07-21     0  1.073
1  1995-07-22     0  1.073
2  1995-07-23     0  1.073
3  1995-07-24     0  1.000
4  1995-07-25     0  0.990

Paso 5#

Prompt: I want you to act as a python programmer assistant.

Remove NA values from previous dataframe.

Do not explain the code.

assets/chatgpt-05.png

[5]:
melted.dropna(inplace=True)
print(melted.head())
        Fecha  hour  value
0  1995-07-21     0  1.073
1  1995-07-22     0  1.073
2  1995-07-23     0  1.073
3  1995-07-24     0  1.000
4  1995-07-25     0  0.990

Paso 6#

Prompt: I want you to act as a python programmer assistant.

Create a dataframe called “precios_diarios” as the mean values of the column “value” of the previous dataframe grouped by “Fecha” column.

Do not explain the code.

assets/chatgpt-06.png

[6]:
precios_diarios = melted.groupby("Fecha")["value"].mean().reset_index()
print(precios_diarios.head())
        Fecha     value
0  1995-07-21  4.924333
1  1995-07-22  1.269500
2  1995-07-23  0.953083
3  1995-07-24  4.305917
4  1995-07-25  1.149167

Paso 7#

Prompt: I want you to act as a python programmer assistant.

The column “Fecha” of the “melted” dataframe contains dates in the format “YYYY-MM-DD”. Create a dataframe called “precios_mensuales” as the mean values of the column “value” of the “melted” dataframe grouped by year and month, using the transform method. The new dataframe conservs the “Fecha” column.

Do not explain the code.

assets/chatgpt-07.png

[7]:
melted["year_month"] = melted["Fecha"].apply(lambda x: x[:-3])
precios_mensuales = melted.groupby("year_month")["value"].transform("mean")
precios_mensuales = pd.concat([melted["Fecha"], precios_mensuales], axis=1)
print(precios_mensuales.head())
        Fecha     value
0  1995-07-21  1.552087
1  1995-07-22  1.552087
2  1995-07-23  1.552087
3  1995-07-24  1.552087
4  1995-07-25  1.552087

Paso 8#

Prompt: I want you to act as a python programmer assistant.

Rename the column “value” in “precios_diarios” dataframe as “precio diario”. Rename the column “value” in “precios_mensuales” dataframe as “precio mensual”.

Do not explain the code.

assets/chatgpt-08.png

[8]:
precios_diarios = precios_diarios.rename(columns={"value": "precio diario"})
precios_mensuales = precios_mensuales.rename(columns={"value": "precio mensual"})

print(precios_diarios.head())
print(precios_mensuales.head())
        Fecha  precio diario
0  1995-07-21       4.924333
1  1995-07-22       1.269500
2  1995-07-23       0.953083
3  1995-07-24       4.305917
4  1995-07-25       1.149167
        Fecha  precio mensual
0  1995-07-21        1.552087
1  1995-07-22        1.552087
2  1995-07-23        1.552087
3  1995-07-24        1.552087
4  1995-07-25        1.552087

Paso 9#

Prompt: I want you to act as a python programmer assistant.

Create a time plot using matplotlib with the column “precio diario” of “precios_diarios” dataframe, and the column “precio mensual” of “precios_mensuales” dataframe. FixedFormatter should only used with FixedLocator.

Do not explain the code.

assets/chatgpt-09.png

[9]:
import matplotlib.pyplot as plt

precios_diarios["Fecha"] = pd.to_datetime(precios_diarios["Fecha"])
precios_mensuales["Fecha"] = pd.to_datetime(precios_mensuales["Fecha"])

precios_diarios = precios_diarios.set_index("Fecha")
precios_mensuales = precios_mensuales.set_index("Fecha")

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(precios_diarios["precio diario"], label="Precio diario")
ax.plot(precios_mensuales["precio mensual"], label="Precio mensual")

years = pd.date_range(
    start=precios_diarios.index.min(), end=precios_diarios.index.max(), freq="YS"
)
ax.set_xticks(years)
ax.set_xticklabels(years.strftime("%Y"), rotation=45, ha="right")

ax.set_ylabel("Precio ($/kWh)")
ax.legend()

plt.show()
../../_images/01_matplotlib_notebooks_14_tutorial_precios_bolsa_chatgpt_38_0.png