RNN in TensorFlow/Keras: Time Series Prediction
A simple recurrent neural network in TensorFlow/Keras for predicting the next value of a synthetic time series with multiple frequencies.
Fundamentals of RNNs with TensorFlow/Keras: time series prediction
In this notebook we will build, train, and evaluate a Recurrent Neural Network (RNN) from scratch (using tensorflow.keras) on a classic sequential-data task: predicting the next value of a time series.
Learning objective
The idea is to connect what was covered in the theory of the fundamentals submodule:
- what sequential data is,
- why an MLP alone does not model temporal order well,
- how an RNN reuses a hidden state to accumulate context,
- what it means to train an RNN with Backpropagation Through Time (BPTT),
- and why sequence length affects learning.
We will work with a synthetic dataset (generated from a formula plus noise), because it lets us:
- control the difficulty of the problem,
- better understand which temporal pattern the model must learn,
- and clearly visualize when the RNN generalizes well or poorly.
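To make the BPTT point concrete before the main pipeline, here is a minimal NumPy sketch (toy sizes chosen purely for illustration, not the notebook's model): it runs the recurrence forward, then propagates a gradient backward through time and shows how its norm shrinks with distance, which is the root of the vanishing-gradient issue mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SimpleRNN sizes (hypothetical, for illustration only)
H, T = 4, 30
W_xh = rng.normal(0, 0.2, (H, 1))
W_hh = rng.normal(0, 0.2, (H, H))
b_h = np.zeros(H)
x = rng.normal(0, 1, (T, 1))

# Forward pass: keep every hidden state for the backward sweep
hs = [np.zeros(H)]
for t in range(T):
    hs.append(np.tanh(W_xh @ x[t] + W_hh @ hs[-1] + b_h))

# Backward pass (BPTT): push dL/dh_T back one step at a time
grad = np.ones(H)                # stand-in for the loss gradient at h_T
grad_norms = []
for t in range(T, 0, -1):
    grad = W_hh.T @ (grad * (1.0 - hs[t] ** 2))  # through tanh, then W_hh
    grad_norms.append(np.linalg.norm(grad))

# The gradient reaching early time steps is orders of magnitude smaller
print(f"1 step back: {grad_norms[0]:.3e}, {T} steps back: {grad_norms[-1]:.3e}")
```

Each backward step multiplies the gradient by the tanh derivative (at most 1) and by the recurrent matrix, so with moderate weights the signal reaching distant time steps decays geometrically.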
Dataset we will use
We will generate a time series with three components:
\[ y_t = \underbrace{\sin(0.04t)}_{\text{slow pattern}} + 0.5\,\underbrace{\sin(0.20t)}_{\text{fast pattern}} + 0.0015t + \epsilon_t \]
with \(\epsilon_t \sim \mathcal{N}(0, \sigma^2)\).
- The sum of sines creates a non-trivial signal with distinct frequencies.
- The linear term adds a slight trend.
- The noise simulates realistic measurements.
Mathematical foundation of the RNN (essential version)
A simple RNN processes a sequence \((x_1, x_2, \dots, x_T)\) step by step:
\[ h_t = \phi(W_{xh}x_t + W_{hh}h_{t-1} + b_h) \]
\[ \hat{y}_t = W_{hy}h_t + b_y \]
where:
- \(h_t\) is the hidden state (memory),
- \(\phi\) is typically tanh,
- \(W_{hh}\) lets the network reuse past information.
For our many-to-one task (using a time window to predict the next value), we take the last hidden state to estimate \(\hat{y}_{T+1}\).
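The equations above can be read directly as code. A minimal NumPy sketch of the many-to-one forward pass, with random toy weights (the actual model below learns its weights with Keras; sizes here are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dimensions: scalar input, 8 hidden units (illustrative only)
n_hidden, T = 8, 40
W_xh = rng.normal(0, 0.1, (n_hidden, 1))
W_hh = rng.normal(0, 0.1, (n_hidden, n_hidden))
b_h = np.zeros(n_hidden)
W_hy = rng.normal(0, 0.1, (1, n_hidden))
b_y = np.zeros(1)

def rnn_many_to_one(x_seq):
    """h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h); predict from the last state."""
    h = np.zeros(n_hidden)
    for x_t in x_seq:                        # one recurrence step per time index
        h = np.tanh(W_xh @ np.atleast_1d(x_t) + W_hh @ h + b_h)
    return float((W_hy @ h + b_y)[0])        # scalar estimate of the next value

window = np.sin(0.04 * np.arange(T))         # a toy input window
y_hat = rnn_many_to_one(window)
print(y_hat)
```

Note that the same `W_xh`, `W_hh`, `b_h` are reused at every step: the parameters are shared across time, which is exactly what BPTT must account for during training.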
What we will implement
- Basic EDA of the series (plots, statistics, autocorrelation).
- Sliding-window creation (input sequences and future targets).
- A reference model (naive baseline).
- A `SimpleRNN` model in Keras.
- Training with validation and early stopping.
- Learning curves (`loss` and `MAE` on train/val).
- Evaluation on test with metrics and visualizations.
- Error analysis and conclusions.
Note: we keep the notebook at a foundational level: enough rigor to understand the mechanism, without yet moving into LSTM/GRU cells or deep architectures.
# ==============================
# 1) Imports and configuration
# ==============================
import os
# force CPU execution (avoids XLA/Triton autotuner errors on GPU)
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
def set_seed(seed=42):
# Fix seeds for better reproducibility
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
os.environ["PYTHONHASHSEED"] = str(seed)
set_seed(42)
sns.set(style="whitegrid", context="notebook")
plt.rcParams["figure.figsize"] = (12, 4)
print("TensorFlow:", tf.__version__)
TensorFlow: 2.21.0
# ==============================================
# 2) Synthetic sequential dataset generation
# ==============================================
n_points = 2500
t = np.arange(n_points)
signal_slow = np.sin(0.04 * t)
signal_fast = 0.5 * np.sin(0.20 * t)
trend = 0.0015 * t
noise = np.random.normal(0, 0.20, size=n_points)
y = signal_slow + signal_fast + trend + noise
df = pd.DataFrame({"t": t, "signal": y, "slow": signal_slow, "fast": signal_fast, "trend": trend})
df.head()
| | t | signal | slow | fast | trend |
|---|---|---|---|---|---|
| 0 | 0 | 0.099343 | 0.000000 | 0.000000 | 0.0000 |
| 1 | 1 | 0.113171 | 0.039989 | 0.099335 | 0.0015 |
| 2 | 2 | 0.407162 | 0.079915 | 0.194709 | 0.0030 |
| 3 | 3 | 0.711139 | 0.119712 | 0.282321 | 0.0045 |
| 4 | 4 | 0.477166 | 0.159318 | 0.358678 | 0.0060 |
# ==============================
# 3) Initial EDA of the series
# ==============================
print("Shape:", df.shape)
display(df.describe().T)
print("\nMissing values per column:\n", df.isna().sum())
Shape: (2500, 5)
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| t | 2500.0 | 1249.500000 | 721.832160 | 0.000000 | 624.750000 | 1249.500000 | 1874.250000 | 2499.000000 |
| signal | 2500.0 | 1.884405 | 1.327071 | -1.551169 | 0.897504 | 1.874144 | 2.857377 | 5.122747 |
| slow | 2500.0 | 0.001478 | 0.708752 | -1.000000 | -0.709604 | 0.007611 | 0.710100 | 1.000000 |
| fast | 2500.0 | 0.001924 | 0.353459 | -0.500000 | -0.351196 | 0.003545 | 0.354958 | 0.500000 |
| trend | 2500.0 | 1.874250 | 1.082748 | 0.000000 | 0.937125 | 1.874250 | 2.811375 | 3.748500 |
Missing values per column: t 0 signal 0 fast 0 slow 0 trend 0 dtype: int64
fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)
axes[0].plot(df["t"], df["signal"], color="royalblue", linewidth=1)
axes[0].set_title("Target time series (signal)")
axes[0].set_ylabel("Value")
axes[1].plot(df["t"], df["slow"], label="Slow component", alpha=0.9)
axes[1].plot(df["t"], df["fast"], label="Fast component", alpha=0.9)
axes[1].plot(df["t"], df["trend"], label="Trend", alpha=0.9)
axes[1].set_title("Generating components (without noise)")
axes[1].set_xlabel("Time")
axes[1].set_ylabel("Value")
axes[1].legend()
plt.tight_layout(); plt.show()
max_lag = 60
lags = np.arange(1, max_lag + 1)
acorrs = []
x = df["signal"].values
x_centered = x - x.mean()
for lag in lags:
a = x_centered[:-lag]
b = x_centered[lag:]
acorrs.append(np.corrcoef(a, b)[0, 1])
plt.figure(figsize=(12, 4))
plt.stem(lags, acorrs, basefmt=" ")
plt.title("Approximate autocorrelation by lag")
plt.xlabel("Lag")
plt.ylabel("Correlation")
plt.show()
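An equivalent, loop-light way to obtain these coefficients uses the standard sample-ACF estimator (lagged covariance divided by the total variance, a slightly different normalization from the per-lag `corrcoef` above). A small sketch, checked on the pure slow component:

```python
import numpy as np

def acf(x, max_lag):
    """Sample ACF for lags 1..max_lag: lagged covariance over total variance."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[:-k], xc[k:]) / denom
                     for k in range(1, max_lag + 1)])

t = np.arange(2500)
rho = acf(np.sin(0.04 * t), 60)   # pure slow component as a sanity check
print(rho[0])                     # lag-1 autocorrelation of a slow sine is near 1
```

For a slowly varying sine, nearby values are almost identical, so the short-lag coefficients should sit close to 1, matching the pattern visible in the stem plot.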
Preparing for supervised learning
An RNN does not train directly on a single static vector, but on a sequence of fixed length \(L\).
We will convert the series into pairs \((X, y)\):
- \(X\): a window with the last \(L\) values,
- \(y\): the immediately following value.
This implements a many-to-one task:
\[ (x_{t-L+1}, \dots, x_t) \rightarrow x_{t+1} \]
We will also split chronologically into train/val/test (without shuffling), because in time series we must not mix future with past.
def make_windows(series, window_size=40):
"""Convert a 1D series into (X, y) samples with a sliding window."""
X, y = [], []
for i in range(window_size, len(series)):
X.append(series[i - window_size:i])
y.append(series[i])
return np.array(X), np.array(y)
window_size = 40
X_all, y_all = make_windows(df["signal"].values, window_size=window_size)
X_all = X_all[..., np.newaxis]
n = len(X_all)
train_end = int(0.70 * n)
val_end = int(0.85 * n)
X_train, y_train = X_all[:train_end], y_all[:train_end]
X_val, y_val = X_all[train_end:val_end], y_all[train_end:val_end]
X_test, y_test = X_all[val_end:], y_all[val_end:]
print("X_all:", X_all.shape, "y_all:", y_all.shape)
print(f"Train: {X_train.shape}, Val: {X_val.shape}, Test: {X_test.shape}")
X_all: (2460, 40, 1) y_all: (2460,) Train: (1722, 40, 1), Val: (369, 40, 1), Test: (369, 40, 1)
idx = 120
sample_window = X_train[idx].squeeze()
sample_target = y_train[idx]
plt.figure(figsize=(10, 4))
plt.plot(range(window_size), sample_window, marker="o", markersize=3, label="Input window")
plt.scatter(window_size, sample_target, color="crimson", s=60, label="Next-step target")
plt.title("Example many-to-one sample")
plt.xlabel("Relative time step")
plt.ylabel("Value")
plt.legend(); plt.show()
train_mean = X_train.mean(); train_std = X_train.std()
X_train_s = (X_train - train_mean) / train_std
X_val_s = (X_val - train_mean) / train_std
X_test_s = (X_test - train_mean) / train_std
y_train_s = (y_train - train_mean) / train_std
y_val_s = (y_val - train_mean) / train_std
y_test_s = (y_test - train_mean) / train_std
print("Scaled train mean:", X_train_s.mean().round(4))
print("Scaled train std:", X_train_s.std().round(4))
Scaled train mean: 0.0 Scaled train std: 1.0
A naive baseline before training the RNN
Before using a complex model, it pays to compare against a simple baseline. Here we use the persistence predictor: for each window, predict that the next value will equal the last observed value.
y_pred_naive = X_test[:, -1, 0]
mae_naive = mean_absolute_error(y_test, y_pred_naive)
rmse_naive = np.sqrt(mean_squared_error(y_test, y_pred_naive))
r2_naive = r2_score(y_test, y_pred_naive)
print("Baseline naive (test)")
print(f"MAE : {mae_naive:.4f}")
print(f"RMSE: {rmse_naive:.4f}")
print(f"R2 : {r2_naive:.4f}")
Baseline naive (test) MAE : 0.2324 RMSE: 0.2907 R2 : 0.8720
RNN model in Keras
Architecture used:
- `SimpleRNN(32, activation='tanh')`
- `Dense(16, activation='relu')`
- `Dense(1)` for regression.
model = keras.Sequential([
layers.Input(shape=(window_size, 1)),
layers.SimpleRNN(32, activation="tanh", name="rnn_fundamental"),
layers.Dense(16, activation="relu"),
layers.Dense(1)
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse", metrics=["mae"])
model.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| rnn_fundamental (SimpleRNN) | (None, 32) | 1,088 |
| dense (Dense) | (None, 16) | 528 |
| dense_1 (Dense) | (None, 1) | 17 |
Total params: 1,633 (6.38 KB)
Trainable params: 1,633 (6.38 KB)
Non-trainable params: 0 (0.00 B)
early_stopping = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)
history = model.fit(
X_train_s, y_train_s,
validation_data=(X_val_s, y_val_s),
epochs=80,
batch_size=32,
callbacks=[early_stopping],
verbose=1,
)
Epoch 1/80 - 54/54 - loss: 0.1690 - mae: 0.3075 - val_loss: 0.1123 - val_mae: 0.2664
Epoch 2/80 - 54/54 - loss: 0.0661 - mae: 0.2064 - val_loss: 0.0822 - val_mae: 0.2283
...
Epoch 47/80 - 54/54 - loss: 0.0461 - mae: 0.1737 - val_loss: 0.0521 - val_mae: 0.1828
...
Epoch 57/80 - 54/54 - loss: 0.0447 - mae: 0.1703 - val_loss: 0.0527 - val_mae: 0.1819
(intermediate epochs omitted; early stopping halted training after epoch 57, restoring the best weights from epoch 47)
hist_df = pd.DataFrame(history.history)
fig, axes = plt.subplots(1, 2, figsize=(14, 4))
axes[0].plot(hist_df.index + 1, hist_df["loss"], label="Train loss")
axes[0].plot(hist_df.index + 1, hist_df["val_loss"], label="Val loss")
axes[0].set_title("Loss (MSE) per epoch")
axes[0].set_xlabel("Epoch"); axes[0].set_ylabel("MSE"); axes[0].legend()
axes[1].plot(hist_df.index + 1, hist_df["mae"], label="Train MAE")
axes[1].plot(hist_df.index + 1, hist_df["val_mae"], label="Val MAE")
axes[1].set_title("MAE per epoch")
axes[1].set_xlabel("Epoch"); axes[1].set_ylabel("MAE"); axes[1].legend()
plt.tight_layout(); plt.show()
y_pred_test_s = model.predict(X_test_s).squeeze()
y_pred_test = y_pred_test_s * train_std + train_mean
mae_rnn = mean_absolute_error(y_test, y_pred_test)
rmse_rnn = np.sqrt(mean_squared_error(y_test, y_pred_test))
r2_rnn = r2_score(y_test, y_pred_test)
print("RNN (test)")
print(f"MAE : {mae_rnn:.4f}")
print(f"RMSE: {rmse_rnn:.4f}")
print(f"R2 : {r2_rnn:.4f}")
print("\nComparison vs naive baseline:")
print(f"MAE improvement : {((mae_naive - mae_rnn)/mae_naive)*100:.2f}%")
print(f"RMSE improvement: {((rmse_naive - rmse_rnn)/rmse_naive)*100:.2f}%")
12/12 - 0s 5ms/step
RNN (test) MAE : 0.2099 RMSE: 0.2575 R2 : 0.8995
Comparison vs naive baseline: MAE improvement : 9.70% RMSE improvement: 11.41%
n_plot = 220
plt.figure(figsize=(14, 4))
plt.plot(y_test[:n_plot], label="Actual", linewidth=2)
plt.plot(y_pred_test[:n_plot], label="RNN prediction", linewidth=2)
plt.title("Actual vs predicted series (test segment)")
plt.xlabel("Test index")
plt.ylabel("Value")
plt.legend(); plt.show()
plt.figure(figsize=(6, 6))
plt.scatter(y_test, y_pred_test, alpha=0.5)
min_v, max_v = min(y_test.min(), y_pred_test.min()), max(y_test.max(), y_pred_test.max())
plt.plot([min_v, max_v], [min_v, max_v], "r--", label="Ideal")
plt.title("Actual vs Predicted")
plt.xlabel("Actual value"); plt.ylabel("Predicted value"); plt.legend(); plt.show()
errors = y_test - y_pred_test
plt.figure(figsize=(10, 4))
sns.histplot(errors, bins=40, kde=True)
plt.title("Error distribution (actual - predicted)")
plt.xlabel("Error"); plt.show()
def train_short_experiment(window_size, epochs=20):
X, y = make_windows(df["signal"].values, window_size=window_size)
X = X[..., np.newaxis]
n = len(X)
tr_end, va_end = int(0.70 * n), int(0.85 * n)
X_tr, y_tr = X[:tr_end], y[:tr_end]
X_va, y_va = X[tr_end:va_end], y[tr_end:va_end]
X_te, y_te = X[va_end:], y[va_end:]
mu, sigma = X_tr.mean(), X_tr.std()
X_tr = (X_tr - mu) / sigma
X_va = (X_va - mu) / sigma
X_te = (X_te - mu) / sigma
y_tr = (y_tr - mu) / sigma
y_va = (y_va - mu) / sigma
m = keras.Sequential([
layers.Input(shape=(window_size, 1)),
layers.SimpleRNN(24, activation="tanh"),
layers.Dense(1)
])
m.compile(optimizer="adam", loss="mse", metrics=["mae"])
m.fit(X_tr, y_tr, validation_data=(X_va, y_va), epochs=epochs, batch_size=32, verbose=0)
pred_s = m.predict(X_te, verbose=0).squeeze()
pred = pred_s * sigma + mu
mae = mean_absolute_error(y_te, pred)
rmse = np.sqrt(mean_squared_error(y_te, pred))
return mae, rmse
results = []
for w in [20, 40, 80]:
mae_w, rmse_w = train_short_experiment(w, epochs=20)
results.append({"window_size": w, "MAE": mae_w, "RMSE": rmse_w})
res_df = pd.DataFrame(results)
display(res_df)
plt.figure(figsize=(8, 4))
plt.plot(res_df["window_size"], res_df["MAE"], marker="o", label="MAE")
plt.plot(res_df["window_size"], res_df["RMSE"], marker="s", label="RMSE")
plt.title("Sensitivity to window size")
plt.xlabel("window_size")
plt.ylabel("Error")
plt.legend(); plt.show()
| | window_size | MAE | RMSE |
|---|---|---|---|
| 0 | 20 | 0.278508 | 0.351933 |
| 1 | 40 | 0.281490 | 0.365566 |
| 2 | 80 | 0.239398 | 0.301428 |
Conclusions
- A simple RNN learns short- and medium-range temporal dependencies in this noisy series.
- The many-to-one approach (window -> next value) works well for one-step forecasting.
- Comparing against a naive baseline prevents misleading conclusions.
- The window length influences performance.
Next steps
- Compare with `LSTM` and `GRU`.
- Extend to multi-step prediction.
- Add exogenous variables and regularization.
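As a starting point for the first of these steps, the recurrent cell can be swapped while keeping the rest of the pipeline intact. A hedged sketch (the `make_model` helper is ours, not part of the notebook above):

```python
from tensorflow import keras
from tensorflow.keras import layers

window_size = 40  # same window length used throughout the notebook

def make_model(cell="lstm"):
    """Same architecture as before, with the recurrent layer made swappable."""
    Cell = {"rnn": layers.SimpleRNN, "lstm": layers.LSTM, "gru": layers.GRU}[cell]
    model = keras.Sequential([
        layers.Input(shape=(window_size, 1)),
        Cell(32),                        # gated cells help with longer windows
        layers.Dense(16, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

lstm_model = make_model("lstm")  # train with the same fit(...) call as before
```

Because only the recurrent layer changes, the windowing, scaling, baseline, and evaluation code above can be reused unchanged for a fair comparison.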