{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Regresión Lineal\n",
    "\n",
    "Ecuación de Regresión:\n",
    "\\begin{equation}\n",
    "Y_i = \\beta_0 + \\beta_1 X_i + \\epsilon_i\n",
    "\\end{equation}\n",
    "\n",
    "\n",
    "Ecuación de la Pendiente:\n",
    "\\begin{equation}\n",
    "\\hat{\\beta}_1 = \\frac{(X_i - \\bar{X})} {(Y_i - \\bar{Y})}\n",
    "\\end{equation}\n",
    "\n",
    "Este ejercicio se a adaptado de \"Linear Regression in Julia\" por Silaparasetty, V.\n",
    "\n",
    "[Descargar una muestra de los precios de acciones New York Stock Exchange](https://raw.githubusercontent.com/fernanvilla/data/main/nystocks.csv)\n",
    "\n",
    "[El conjunto completo de datos de precios](https://www.kaggle.com/dgawlik/nyse)\n",
    "    \n",
    "[Otro Ejemplo Recomendado de Regresión Lineal](https://www.machinelearningplus.com/linear-regression-in-julia/)\n"
   ]
  },
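  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the regression model above, the ordinary least-squares slope is the sum of cross-deviations of $X$ and $Y$ divided by the sum of squared deviations of $X$. The following is a minimal sketch of that calculation, not code from the source exercise. The data are made up for illustration (points on the exact line $y = 2x + 1$), the names `x`, `y`, `b0`, and `b1` are hypothetical, and the intercept formula $\\hat{\\beta}_0 = \\bar{Y} - \\hat{\\beta}_1 \\bar{X}$ is the standard least-squares result rather than something stated in the source.\n",
    "\n",
    "```julia\n",
    "using Statistics\n",
    "\n",
    "# Illustrative data only: points lying exactly on y = 2x + 1\n",
    "x = [1.0, 2.0, 3.0, 4.0, 5.0]\n",
    "y = 2 .* x .+ 1\n",
    "\n",
    "# Slope: sum of cross-deviations divided by sum of squared deviations of x\n",
    "dx = x .- mean(x)\n",
    "dy = y .- mean(y)\n",
    "b1 = sum(dx .* dy) / sum(dx .^ 2)\n",
    "\n",
    "# Intercept: the fitted line passes through (mean(x), mean(y))\n",
    "b0 = mean(y) - b1 * mean(x)\n",
    "\n",
    "println((b0, b1))  # prints (1.0, 2.0)\n",
    "```\n",
    "\n",
    "Because the points are exactly collinear, the estimates recover the coefficients of the line; with noisy data the same formulas give the least-squares fit."
   ]
  },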
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m  Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Project.toml`\n",
      "\u001b[32m\u001b[1mNo Changes\u001b[22m\u001b[39m to `C:\\Users\\Fernan\\.julia\\environments\\v1.5\\Manifest.toml`\n",
      "\u001b[32m\u001b[1m   Building\u001b[22m\u001b[39m CodecZlib → `C:\\Users\\Fernan\\.julia\\packages\\CodecZlib\\5t9zO\\deps\\build.log`\n"
     ]
    }
   ],
   "source": [
    "# Import Packages\n",
    "using Pkg  # Package to install new packages\n",
    "\n",
    "# Install packages \n",
    "Pkg.add(\"DataFrames\")\n",
    "Pkg.add(\"CSV\")\n",
    "Pkg.add(\"CSVFiles\")\n",
    "Pkg.add(\"Plots\")\n",
    "Pkg.add(\"Lathe\")\n",
    "Pkg.add(\"GLM\")\n",
    "Pkg.add(\"StatsPlots\")\n",
    "Pkg.add(\"MLBase\")\n",
    "Pkg.add(\"Missings\")\n",
    "Pkg.add(\"Statistics\")\n",
    "Pkg.add(\"Plots\")\n",
    "Pkg.build(\"CodecZlib\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cargar los paquetes instalados\n",
    "using DataFrames\n",
    "using CSV\n",
    "using CSVFiles\n",
    "using Plots\n",
    "using Lathe\n",
    "using GLM\n",
    "using Statistics\n",
    "using StatsPlots\n",
    "using MLBase\n",
    "using Missings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5×7 DataFrame\n",
      "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume    │\n",
      "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m     │\n",
      "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼───────────┤\n",
      "│ 1   │ 04-01-2010 │ A      │ 31.39   │ 31.3    │ 31.13   │ 31.63   │ 3815500   │\n",
      "│ 2   │ 04-01-2010 │ AAP    │ 40.7    │ 40.38   │ 40.36   │ 41.04   │ 1701700   │\n",
      "│ 3   │ 04-01-2010 │ AAPL   │ 213.43  │ 214.01  │ 212.38  │ 214.5   │ 123432400 │\n",
      "│ 4   │ 04-01-2010 │ ABC    │ 26.29   │ 26.63   │ 26.14   │ 26.69   │ 2455900   │\n",
      "│ 5   │ 04-01-2010 │ ABT    │ 54.19   │ 54.46   │ 53.92   │ 54.56   │ 10829000  │\n"
     ]
    }
   ],
   "source": [
    "# Carga el archivo CSV en un DataFrame\n",
    "# para más detalles consultar -> https://juliapackages.com/p/csvfiles\n",
    "\n",
    "using CSVFiles, DataFrames\n",
    "\n",
    "df = DataFrame(load(\"./Downloads/nystocks.csv\"))\n",
    "\n",
    "println(first(df,5))"
   ]
  },
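  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `date` column is loaded as `String`. The sketch below, which is not part of the source exercise, shows one way to convert such strings with the standard `Dates` library. It assumes the values are in day-month-year order (the sample's 04-01-2010 to 06-01-2010 would then be consecutive days in January 2010, but that reading is an assumption), and the names `raw`, `fmt`, and `parsed` are hypothetical.\n",
    "\n",
    "```julia\n",
    "using Dates\n",
    "\n",
    "# Date strings as they appear in the sample output\n",
    "raw = [\"04-01-2010\", \"05-01-2010\", \"06-01-2010\"]\n",
    "\n",
    "# Assumed layout: day-month-year\n",
    "fmt = dateformat\"dd-mm-yyyy\"\n",
    "parsed = [Date(s, fmt) for s in raw]\n",
    "\n",
    "println(parsed[1])  # prints 2010-01-04\n",
    "```\n",
    "\n",
    "Applied to the notebook's frame this might look like `df.date = [Date(s, fmt) for s in df.date]`, provided every row follows the same layout."
   ]
  },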
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exploración de los Datos"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7-element Array{String,1}:\n",
       " \"date\"\n",
       " \"symbol\"\n",
       " \"open\"\n",
       " \"close\"\n",
       " \"low\"\n",
       " \"high\"\n",
       " \"volume\""
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Variables Disponibles\n",
    "names(df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "
|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
5 rows × 7 columns
| 1 | 04-01-2010 | A | 31.39 | 31.3 | 31.13 | 31.63 | 3815500 | 
|---|
| 2 | 04-01-2010 | AAP | 40.7 | 40.38 | 40.36 | 41.04 | 1701700 | 
|---|
| 3 | 04-01-2010 | AAPL | 213.43 | 214.01 | 212.38 | 214.5 | 123432400 | 
|---|
| 4 | 04-01-2010 | ABC | 26.29 | 26.63 | 26.14 | 26.69 | 2455900 | 
|---|
| 5 | 04-01-2010 | ABT | 54.19 | 54.46 | 53.92 | 54.56 | 10829000 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 04-01-2010 & A & 31.39 & 31.3 & 31.13 & 31.63 & 3815500 \\\\\n",
       "\t2 & 04-01-2010 & AAP & 40.7 & 40.38 & 40.36 & 41.04 & 1701700 \\\\\n",
       "\t3 & 04-01-2010 & AAPL & 213.43 & 214.01 & 212.38 & 214.5 & 123432400 \\\\\n",
       "\t4 & 04-01-2010 & ABC & 26.29 & 26.63 & 26.14 & 26.69 & 2455900 \\\\\n",
       "\t5 & 04-01-2010 & ABT & 54.19 & 54.46 & 53.92 & 54.56 & 10829000 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "5×7 DataFrame. Omitted printing of 1 columns\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┤\n",
       "│ 1   │ 04-01-2010 │ A      │ 31.39   │ 31.3    │ 31.13   │ 31.63   │\n",
       "│ 2   │ 04-01-2010 │ AAP    │ 40.7    │ 40.38   │ 40.36   │ 41.04   │\n",
       "│ 3   │ 04-01-2010 │ AAPL   │ 213.43  │ 214.01  │ 212.38  │ 214.5   │\n",
       "│ 4   │ 04-01-2010 │ ABC    │ 26.29   │ 26.63   │ 26.14   │ 26.69   │\n",
       "│ 5   │ 04-01-2010 │ ABT    │ 54.19   │ 54.46   │ 53.92   │ 54.56   │"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Presentar las primeras 5 filas\n",
    "first(df,5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
5 rows × 7 columns
| 1 | 06-01-2010 | BMY | 25.17 | 25.22 | 25.07 | 25.29 | 15528900 | 
|---|
| 2 | 06-01-2010 | BSX | 9.07 | 9.16 | 8.99 | 9.28 | 12923000 | 
|---|
| 3 | 06-01-2010 | BWA | 35.39 | 36.69 | 35.3 | 36.78 | 4171000 | 
|---|
| 4 | 06-01-2010 | BXP | 68.23 | 68.44 | 68.03 | 68.94 | 1814900 | 
|---|
| 5 | 06-01-2010 | C | 3.56 | 3.64 | 3.51 | 3.68 | 67433800 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 06-01-2010 & BMY & 25.17 & 25.22 & 25.07 & 25.29 & 15528900 \\\\\n",
       "\t2 & 06-01-2010 & BSX & 9.07 & 9.16 & 8.99 & 9.28 & 12923000 \\\\\n",
       "\t3 & 06-01-2010 & BWA & 35.39 & 36.69 & 35.3 & 36.78 & 4171000 \\\\\n",
       "\t4 & 06-01-2010 & BXP & 68.23 & 68.44 & 68.03 & 68.94 & 1814900 \\\\\n",
       "\t5 & 06-01-2010 & C & 3.56 & 3.64 & 3.51 & 3.68 & 67433800 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "5×7 DataFrame\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume   │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m    │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼──────────┤\n",
       "│ 1   │ 06-01-2010 │ BMY    │ 25.17   │ 25.22   │ 25.07   │ 25.29   │ 15528900 │\n",
       "│ 2   │ 06-01-2010 │ BSX    │ 9.07    │ 9.16    │ 8.99    │ 9.28    │ 12923000 │\n",
       "│ 3   │ 06-01-2010 │ BWA    │ 35.39   │ 36.69   │ 35.3    │ 36.78   │ 4171000  │\n",
       "│ 4   │ 06-01-2010 │ BXP    │ 68.23   │ 68.44   │ 68.03   │ 68.94   │ 1814900  │\n",
       "│ 5   │ 06-01-2010 │ C      │ 3.56    │ 3.64    │ 3.51    │ 3.68    │ 67433800 │"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Las últimas 5 filas\n",
    "last(df,5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "|  | variable | mean | min | median | max | nunique | nmissing | eltype | 
|---|
|  | Symbol | Union… | Any | Union… | Any | Union… | Nothing | DataType | 
|---|
7 rows × 8 columns
| 1 | date |  | 04-01-2010 |  | 06-01-2010 | 5 |  | String | 
|---|
| 2 | symbol |  | A |  | ZION | 467 |  | String | 
|---|
| 3 | open | 46.9074 | 1.53 | 37.07 | 627.181 |  |  | Float64 | 
|---|
| 4 | close | 47.0407 | 1.61 | 37.25 | 626.751 |  |  | Float64 | 
|---|
| 5 | low | 46.4453 | 1.51 | 36.74 | 624.241 |  |  | Float64 | 
|---|
| 6 | high | 47.4197 | 1.61 | 37.76 | 629.511 |  |  | Float64 | 
|---|
| 7 | volume | 7.01361e6 | 10000 | 3.0912e6 | 215620200 |  |  | Int64 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|cccccccc}\n",
       "\t& variable & mean & min & median & max & nunique & nmissing & eltype\\\\\n",
       "\t\\hline\n",
       "\t& Symbol & Union… & Any & Union… & Any & Union… & Nothing & DataType\\\\\n",
       "\t\\hline\n",
       "\t1 & date &  & 04-01-2010 &  & 06-01-2010 & 5 &  & String \\\\\n",
       "\t2 & symbol &  & A &  & ZION & 467 &  & String \\\\\n",
       "\t3 & open & 46.9074 & 1.53 & 37.07 & 627.181 &  &  & Float64 \\\\\n",
       "\t4 & close & 47.0407 & 1.61 & 37.25 & 626.751 &  &  & Float64 \\\\\n",
       "\t5 & low & 46.4453 & 1.51 & 36.74 & 624.241 &  &  & Float64 \\\\\n",
       "\t6 & high & 47.4197 & 1.61 & 37.76 & 629.511 &  &  & Float64 \\\\\n",
       "\t7 & volume & 7.01361e6 & 10000 & 3.0912e6 & 215620200 &  &  & Int64 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "7×8 DataFrame. Omitted printing of 2 columns\n",
       "│ Row │ variable │ mean      │ min        │ median   │ max        │ nunique │\n",
       "│     │ \u001b[90mSymbol\u001b[39m   │ \u001b[90mUnion…\u001b[39m    │ \u001b[90mAny\u001b[39m        │ \u001b[90mUnion…\u001b[39m   │ \u001b[90mAny\u001b[39m        │ \u001b[90mUnion…\u001b[39m  │\n",
       "├─────┼──────────┼───────────┼────────────┼──────────┼────────────┼─────────┤\n",
       "│ 1   │ date     │           │ 04-01-2010 │          │ 06-01-2010 │ 5       │\n",
       "│ 2   │ symbol   │           │ A          │          │ ZION       │ 467     │\n",
       "│ 3   │ open     │ 46.9074   │ 1.53       │ 37.07    │ 627.181    │         │\n",
       "│ 4   │ close    │ 47.0407   │ 1.61       │ 37.25    │ 626.751    │         │\n",
       "│ 5   │ low      │ 46.4453   │ 1.51       │ 36.74    │ 624.241    │         │\n",
       "│ 6   │ high     │ 47.4197   │ 1.61       │ 37.76    │ 629.511    │         │\n",
       "│ 7   │ volume   │ 7.01361e6 │ 10000      │ 3.0912e6 │ 215620200  │         │"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Algunos Indicadores Estadísticos\n",
    "describe(df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "GroupedDataFrame with 467 groups based on key: symbol
First Group (3 rows): symbol = \"A\"
|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
| 1 | 04-01-2010 | A | 31.39 | 31.3 | 31.13 | 31.63 | 3815500 | 
|---|
| 2 | 05-01-2010 | A | 31.21 | 30.96 | 30.76 | 31.22 | 4186000 | 
|---|
| 3 | 06-01-2010 | A | 30.85 | 30.85 | 30.76 | 31.0 | 3243700 | 
|---|
⋮
Last Group (1 row): symbol = \"CHTR\"
|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
| 1 | 05-01-2010 | CHTR | 35.0 | 35.0 | 35.0 | 35.0 | 10000 | 
|---|
"
      ],
      "text/latex": [
       "GroupedDataFrame with 467 groups based on key: symbol\n",
       "\n",
       "First Group (3 rows): symbol = \"A\"\n",
       "\n",
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 04-01-2010 & A & 31.39 & 31.3 & 31.13 & 31.63 & 3815500 \\\\\n",
       "\t2 & 05-01-2010 & A & 31.21 & 30.96 & 30.76 & 31.22 & 4186000 \\\\\n",
       "\t3 & 06-01-2010 & A & 30.85 & 30.85 & 30.76 & 31.0 & 3243700 \\\\\n",
       "\\end{tabular}\n",
       "\n",
       "$\\dots$\n",
       "\n",
       "Last Group (1 row): symbol = \"CHTR\"\n",
       "\n",
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 05-01-2010 & CHTR & 35.0 & 35.0 & 35.0 & 35.0 & 10000 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "GroupedDataFrame with 467 groups based on key: symbol\n",
       "First Group (3 rows): symbol = \"A\"\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume  │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m   │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼─────────┤\n",
       "│ 1   │ 04-01-2010 │ A      │ 31.39   │ 31.3    │ 31.13   │ 31.63   │ 3815500 │\n",
       "│ 2   │ 05-01-2010 │ A      │ 31.21   │ 30.96   │ 30.76   │ 31.22   │ 4186000 │\n",
       "│ 3   │ 06-01-2010 │ A      │ 30.85   │ 30.85   │ 30.76   │ 31.0    │ 3243700 │\n",
       "⋮\n",
       "Last Group (1 row): symbol = \"CHTR\"\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m  │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼────────┤\n",
       "│ 1   │ 05-01-2010 │ CHTR   │ 35.0    │ 35.0    │ 35.0    │ 35.0    │ 10000  │"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Separar por grupos\n",
    "agrupar = groupby(df, :symbol)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SubDataFrame{DataFrame,DataFrames.Index,Array{Int64,1}}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
3 rows × 7 columns
| 1 | 04-01-2010 | BMY | 25.41 | 25.63 | 25.3 | 25.7 | 14376100 | 
|---|
| 2 | 05-01-2010 | BMY | 25.51 | 25.23 | 25.01 | 25.55 | 16973600 | 
|---|
| 3 | 06-01-2010 | BMY | 25.17 | 25.22 | 25.07 | 25.29 | 15528900 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 04-01-2010 & BMY & 25.41 & 25.63 & 25.3 & 25.7 & 14376100 \\\\\n",
       "\t2 & 05-01-2010 & BMY & 25.51 & 25.23 & 25.01 & 25.55 & 16973600 \\\\\n",
       "\t3 & 06-01-2010 & BMY & 25.17 & 25.22 & 25.07 & 25.29 & 15528900 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "3×7 SubDataFrame\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume   │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m    │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼──────────┤\n",
       "│ 1   │ 04-01-2010 │ BMY    │ 25.41   │ 25.63   │ 25.3    │ 25.7    │ 14376100 │\n",
       "│ 2   │ 05-01-2010 │ BMY    │ 25.51   │ 25.23   │ 25.01   │ 25.55   │ 16973600 │\n",
       "│ 3   │ 06-01-2010 │ BMY    │ 25.17   │ 25.22   │ 25.07   │ 25.29   │ 15528900 │"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Obtener un grupo\n",
    "losBXP = get(agrupar, (symbol=:\"BMY\",), nothing)\n",
    "println(typeof(losBXP))\n",
    "losBXP\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SubDataFrame{DataFrame,DataFrames.Index,Array{Int64,1}}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
3 rows × 7 columns
| 1 | 04-01-2010 | BXP | 67.59 | 67.1 | 66.53 | 68.33 | 1511500 | 
|---|
| 2 | 05-01-2010 | BXP | 67.24 | 68.12 | 66.45 | 68.2 | 2173700 | 
|---|
| 3 | 06-01-2010 | BXP | 68.23 | 68.44 | 68.03 | 68.94 | 1814900 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 04-01-2010 & BXP & 67.59 & 67.1 & 66.53 & 68.33 & 1511500 \\\\\n",
       "\t2 & 05-01-2010 & BXP & 67.24 & 68.12 & 66.45 & 68.2 & 2173700 \\\\\n",
       "\t3 & 06-01-2010 & BXP & 68.23 & 68.44 & 68.03 & 68.94 & 1814900 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "3×7 SubDataFrame\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume  │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m   │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼─────────┤\n",
       "│ 1   │ 04-01-2010 │ BXP    │ 67.59   │ 67.1    │ 66.53   │ 68.33   │ 1511500 │\n",
       "│ 2   │ 05-01-2010 │ BXP    │ 67.24   │ 68.12   │ 66.45   │ 68.2    │ 2173700 │\n",
       "│ 3   │ 06-01-2010 │ BXP    │ 68.23   │ 68.44   │ 68.03   │ 68.94   │ 1814900 │"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Obtener un grupo\n",
    "losBXP = get(agrupar, (symbol=:\"BXP\",), nothing)\n",
    "println(typeof(losBXP))\n",
    "losBXP"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SubDataFrame{DataFrame,DataFrames.Index,Array{Int64,1}}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "|  | date | symbol | open | close | low | high | volume | 
|---|
|  | String | String | Float64 | Float64 | Float64 | Float64 | Int64 | 
|---|
3 rows × 7 columns
| 1 | 04-01-2010 | BMY | 25.41 | 25.63 | 25.3 | 25.7 | 14376100 | 
|---|
| 2 | 05-01-2010 | BMY | 25.51 | 25.23 | 25.01 | 25.55 | 16973600 | 
|---|
| 3 | 06-01-2010 | BMY | 25.17 | 25.22 | 25.07 | 25.29 | 15528900 | 
|---|
"
      ],
      "text/latex": [
       "\\begin{tabular}{r|ccccccc}\n",
       "\t& date & symbol & open & close & low & high & volume\\\\\n",
       "\t\\hline\n",
       "\t& String & String & Float64 & Float64 & Float64 & Float64 & Int64\\\\\n",
       "\t\\hline\n",
       "\t1 & 04-01-2010 & BMY & 25.41 & 25.63 & 25.3 & 25.7 & 14376100 \\\\\n",
       "\t2 & 05-01-2010 & BMY & 25.51 & 25.23 & 25.01 & 25.55 & 16973600 \\\\\n",
       "\t3 & 06-01-2010 & BMY & 25.17 & 25.22 & 25.07 & 25.29 & 15528900 \\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "3×7 SubDataFrame\n",
       "│ Row │ date       │ symbol │ open    │ close   │ low     │ high    │ volume   │\n",
       "│     │ \u001b[90mString\u001b[39m     │ \u001b[90mString\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mFloat64\u001b[39m │ \u001b[90mInt64\u001b[39m    │\n",
       "├─────┼────────────┼────────┼─────────┼─────────┼─────────┼─────────┼──────────┤\n",
       "│ 1   │ 04-01-2010 │ BMY    │ 25.41   │ 25.63   │ 25.3    │ 25.7    │ 14376100 │\n",
       "│ 2   │ 05-01-2010 │ BMY    │ 25.51   │ 25.23   │ 25.01   │ 25.55   │ 16973600 │\n",
       "│ 3   │ 06-01-2010 │ BMY    │ 25.17   │ 25.22   │ 25.07   │ 25.29   │ 15528900 │"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "losBMY = agrupar[(symbol= \"BMY\",)]\n",
    "println(typeof(losBMY))\n",
    "losBMY"
   ]
  },
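  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the rows are grouped, a per-group summary is a common next step. The sketch below is not part of the source exercise: it rebuilds a small frame from the `A` and `BMY` closing prices printed in the outputs above and averages them per symbol with `combine`. The names `sample_df`, `summary_df`, and `mean_close` are hypothetical.\n",
    "\n",
    "```julia\n",
    "using DataFrames, Statistics\n",
    "\n",
    "# Closing prices copied from the A and BMY groups shown in the outputs\n",
    "sample_df = DataFrame(\n",
    "    symbol = [\"A\", \"A\", \"A\", \"BMY\", \"BMY\", \"BMY\"],\n",
    "    close  = [31.3, 30.96, 30.85, 25.63, 25.23, 25.22],\n",
    ")\n",
    "\n",
    "# One row per symbol holding the mean closing price of that group\n",
    "summary_df = combine(groupby(sample_df, :symbol), :close => mean => :mean_close)\n",
    "\n",
    "println(summary_df)\n",
    "```\n",
    "\n",
    "The same call on the notebook's `agrupar` object, `combine(agrupar, :close => mean => :mean_close)`, would presumably give one row for each of the 467 symbols."
   ]
  },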
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Relación entre los precios de apertura vs los de cierre"
   ]
  },
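  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick numeric check of how closely the two prices move together can complement a plot. This is a sketch rather than the source exercise's method: it takes the opening and closing prices from the first five rows printed earlier and computes their Pearson correlation with `cor` from `Statistics`. The names `open_prices`, `close_prices`, and `r` are hypothetical.\n",
    "\n",
    "```julia\n",
    "using Statistics\n",
    "\n",
    "# open / close values from the first five rows of the sample\n",
    "open_prices  = [31.39, 40.7, 213.43, 26.29, 54.19]\n",
    "close_prices = [31.3, 40.38, 214.01, 26.63, 54.46]\n",
    "\n",
    "# Pearson correlation coefficient, always between -1 and 1\n",
    "r = cor(open_prices, close_prices)\n",
    "\n",
    "println(round(r, digits=4))\n",
    "```\n",
    "\n",
    "A value near 1 would suggest a nearly linear relationship, which is the situation a simple linear regression of `close` on `open` is designed for. Five rows from different companies are only a toy sample, so the figure says little about the full data set."
   ]
  },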
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "\n",
       "