python - Steps approximation for time series scatter with mean changing every K number of steps using BIC - Stack Overflow

Date: 2025-01-06 | admin | Industry

First, the synthetic data is generated the following way:

import random

import matplotlib.pyplot as plt
import numpy as np

# Seed both generators, since the script draws from `random` as well as `np.random`
random.seed(2)
np.random.seed(2)

n_samples = 180
time = np.arange(n_samples)

mean_value = random.randrange(60, 90)
mean = np.full(n_samples, mean_value)

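# The length K of each constant-mean segment is drawn at random
K = random.randint(10, 40)

# Lower the mean by 10 at every multiple of K
for i in range(K, n_samples, K):
    mean[i:] = mean[i - K] - 10

# Gaussian noise whose standard deviation is the same for every segment but drawn at random
noise = np.random.randn(n_samples) * random.normalvariate(4, 2)
y = mean + noise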

I'm lost as to how to estimate the means and detect the changes in mean, given that the data is noisy and the variance is constant across the steps but unknown. So far I have a normal likelihood function L, but I don't know how to use it. I know that BIC = p*log(n) - 2*log(L), where p is the number of free parameters and n is the number of observations. The code I have so far is:

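def find_optimal_change_point(data):
    # `bic`, `v`, `k` and `K` are assumed to be defined elsewhere; they are not shown here
    min_bic = float('inf')
    bics = np.full(len(data), min_bic)
    change_points = [0] * K

    for i in range(1, len(data)):
        # Candidate segment: everything before index i
        segment = data[:i]

        # Sample mean and variance of the candidate segment
        mean, var = np.mean(segment), np.var(segment)

        N = len(segment)
        S = var
        new_bic = bic(N, S, 4, segment, v, len(segment), index=i)

        # Record the candidate when its BIC is lower than the previous entry
        if bics[i - 1] > new_bic:
            bics[i] = new_bic
            change_points[k] = data[i]

        elif bics[i] < bics[i - 1]:
            break

    return change_points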
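One way the BIC could be put to work here is sketched below. It is only a sketch under stated assumptions: the noise is Gaussian with a single shared variance, the mean changes exactly every K samples starting at index 0, and the parameter count is taken to be one mean per segment plus the variance plus K itself. The names `piecewise_bic` and `fit_periodic_steps` are hypothetical helpers, not part of any library. For a given segmentation, the maximum-likelihood mean of each segment is its sample mean and the maximum-likelihood variance is RSS/n, so the maximised log-likelihood is -n/2 * (log(2*pi*RSS/n) + 1). The search then tries every candidate K and keeps the one with the lowest BIC.

```python
import numpy as np


def piecewise_bic(y, boundaries, n_params):
    """BIC of a piecewise-constant-mean Gaussian model with one shared variance.

    `boundaries` holds the start index of every segment after the first,
    e.g. [30, 60] splits y into y[:30], y[30:60] and y[60:].
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    segments = np.split(y, boundaries)
    # Residual sum of squares around each segment's own sample mean
    rss = sum(((seg - seg.mean()) ** 2).sum() for seg in segments)
    sigma2 = rss / n  # maximum-likelihood estimate of the shared variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return n_params * np.log(n) - 2 * log_lik


def fit_periodic_steps(y, k_candidates):
    """Return (K, change_points, segment_means) for the K with the lowest BIC."""
    y = np.asarray(y, dtype=float)
    best_bic, best_k, best_bounds = np.inf, None, None
    for k in k_candidates:
        bounds = list(range(k, y.size, k))  # change points at K, 2K, 3K, ...
        # Assumed parameter count: segment means + shared variance + K itself
        n_params = (len(bounds) + 1) + 1 + 1
        bic = piecewise_bic(y, bounds, n_params)
        if bic < best_bic:
            best_bic, best_k, best_bounds = bic, k, bounds
    means = [float(seg.mean()) for seg in np.split(y, best_bounds)]
    return best_k, best_bounds, means
```

With the data from the question, `fit_periodic_steps(y, range(10, 41))` would return the estimated K, the indices where the mean changes, and the estimated mean of each step. A K that is too small fits the data just as well but is penalised for its extra segment means, which is what keeps the search from always choosing the shortest step length.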

I need to draw the steps over the scatter plot: each horizontal segment should sit at the estimated mean of its step, and the vertical jumps should connect consecutive segments at the points where the mean changes.
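A possible way to draw this is sketched below, assuming the change-point indices are already known (for example from a BIC search). `step_curve` and `plot_steps` are hypothetical helper names. The idea is to expand the segment means into one fitted value per sample and pass that to matplotlib's `step` with `where="post"`, which holds each value until the next sample and then jumps vertically, so the jumps land on the change points.

```python
import numpy as np


def step_curve(y, change_points):
    """Return the piecewise-constant fitted mean, one value per sample."""
    y = np.asarray(y, dtype=float)
    segments = np.split(y, change_points)
    return np.concatenate([np.full(seg.size, seg.mean()) for seg in segments])


def plot_steps(time, y, change_points):
    """Scatter the observations and overlay the estimated step function."""
    # Imported here so that step_curve can be used without matplotlib installed
    import matplotlib.pyplot as plt

    fitted = step_curve(y, change_points)
    fig, ax = plt.subplots()
    ax.scatter(time, y, s=10, label="observations")
    # where="post" keeps each mean flat up to the next sample, then jumps vertically
    ax.step(time, fitted, where="post", color="red", label="estimated mean")
    # Dotted guides at the samples where a new step begins
    for cp in change_points:
        ax.axvline(time[cp], color="grey", linestyle=":", linewidth=0.8)
    ax.set_xlabel("time")
    ax.set_ylabel("y")
    ax.legend()
    return fig, ax
```

Calling `plot_steps(time, y, change_points)` followed by `plt.show()` would display the scatter with the red step line on top of it.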