遗传算法实战：Python代码拆解与优化问题求解-程序员充电站

1. 遗传算法入门：从生物进化到代码实现

第一次接触遗传算法时，我被它模拟自然选择的巧妙设计惊艳到了。想象一下，你养了一池塘的金鱼，每年只留下最漂亮的几条繁殖后代，几代之后整个鱼群都会变得赏心悦目——这就是遗传算法的核心思想。在计算机世界里，我们用这个原理来解决那些传统方法难以处理的优化问题。

遗传算法特别适合以下场景：

需要优化复杂函数的最值（比如寻找山峰的最高点）
解空间太大无法穷举（比如有上百万种可能的组合）
问题没有标准解法（比如旅行商问题）

我用Python实现遗传算法时，通常会建立这几个关键组件：

种群：一组可能的解，相当于鱼群
DNA编码：把解转化为可遗传的形式
适应度函数：评估每个解的优劣
选择机制：决定哪些解能产生后代
遗传操作：交叉和变异产生新解

# 最简单的二进制编码示例 import random def create_individual(length): """创建一个随机个体""" return [random.randint(0,1) for _ in range(length)] def create_population(size, dna_length): """初始化种群""" return [create_individual(dna_length) for _ in range(size)]

2. 核心模块拆解：手把手构建算法框架

2.1 DNA编码的艺术

DNA编码就像给解决方案设计"遗传密码"。我做过一个智能调参项目，需要同时优化5个参数，最终采用了这种混合编码方案：

前3个参数用8位二进制表示（0-255）
第4个参数用4位表示类别（16种类别）
第5个参数用浮点数直接编码

def decode_dna(dna): """解码混合编码的DNA""" param1 = int(''.join(map(str, dna[0:8])), 2) param2 = int(''.join(map(str, dna[8:16])), 2) param3 = dna[16:20] # 类别编码 param4 = float(f"{dna[20]}.{''.join(map(str, dna[21:]))}") return [param1, param2, param3, param4]

2.2 适应度函数设计实战

适应度函数是算法的导航系统。在优化一个物流路径问题时，我用了这个技巧：

def fitness(route, cities): """计算路径适应度""" total_distance = 0 for i in range(len(route)-1): city_a = cities[route[i]] city_b = cities[route[i+1]] total_distance += ((city_a[0]-city_b[0])**2 + (city_a[1]-city_b[1])**2)**0.5 # 添加小常数避免除零错误 return 1/(total_distance + 1e-6)

注意：适应度值应该总是非负的，且更好的解对应更大的值。对于最小化问题，可以用倒数或负值转换。

3. 完整案例：求解函数极值

3.1 问题描述与参数设置

我们来优化这个"火山口"函数：

import numpy as np def target_function(x, y): return 3*(1-x)**2*np.exp(-x**2-(y+1)**2) - \ 10*(x/5-x**3-y**5)*np.exp(-x**2-y**2) - \ np.exp(-(x+1)**2-y**2)/3

配置算法参数：

DNA_SIZE = 24 # 每个变量的二进制位数 POP_SIZE = 200 # 种群规模 CROSS_RATE = 0.8 # 交叉概率 MUTATION_RATE = 0.003 N_GENERATIONS = 50 X_BOUND = [-3, 3] # x,y的取值范围 Y_BOUND = [-3, 3]

3.2 完整实现代码

def get_fitness(pred): """将函数值转换为适应度""" return (pred - np.min(pred)) + 1e-3 def translate_dna(pop): """解码DNA到实际值""" x_pop = pop[:,1::2] # 奇数列作为x y_pop = pop[:,::2] # 偶数列作为y x = x_pop.dot(2**np.arange(DNA_SIZE)[::-1])/float(2**DNA_SIZE-1)*(X_BOUND[1]-X_BOUND[0])+X_BOUND[0] y = y_pop.dot(2**np.arange(DNA_SIZE)[::-1])/float(2**DNA_SIZE-1)*(Y_BOUND[1]-Y_BOUND[0])+Y_BOUND[0] return x, y def crossover_and_mutation(pop, CROSS_RATE): new_pop = [] for father in pop: child = father if np.random.rand() < CROSS_RATE: mother = pop[np.random.randint(POP_SIZE)] cross_points = np.random.randint(0, 2, DNA_SIZE*2).astype(np.bool) child[cross_points] = mother[cross_points] mutation(child) new_pop.append(child) return new_pop def mutation(child, MUTATION_RATE=MUTATION_RATE): if np.random.rand() < MUTATION_RATE: mutate_point = np.random.randint(0, DNA_SIZE*2) child[mutate_point] ^= 1 def select(pop, fitness): idx = np.random.choice(np.arange(POP_SIZE), size=POP_SIZE, replace=True, p=fitness/fitness.sum()) return pop[idx] # 主循环 pop = np.random.randint(2, size=(POP_SIZE, DNA_SIZE*2)) for _ in range(N_GENERATIONS): x, y = translate_dna(pop) pred = target_function(x, y) fitness = get_fitness(pred) pop = select(pop, fitness) pop = np.array(crossover_and_mutation(pop, CROSS_RATE))

4. 性能优化与实用技巧

4.1 加速计算的秘诀

在真实项目中，我发现了这些优化点：

向量化计算：用NumPy矩阵运算替代循环

# 慢速版 fitness = [] for ind in population: fitness.append(calculate_fitness(ind)) # 快速版 population_array = np.array(population) fitness = calculate_fitness(population_array)

并行评估：使用multiprocessing

from multiprocessing import Pool def evaluate_parallel(population): with Pool(4) as p: return p.map(evaluate_individual, population)

记忆化存储：缓存已计算过的解

from functools import lru_cache @lru_cache(maxsize=10000) def cached_fitness(dna_tuple): return calculate_fitness(list(dna_tuple))

4.2 调参经验分享

经过多个项目的实践，我总结出这些参数设置规律：

参数	推荐范围	调整策略
种群大小	50-500	问题越复杂，种群越大
变异率	0.001-0.01	前期可以稍高，后期降低
交叉率	0.7-0.95	高交叉率适合解空间平滑的问题
迭代次数	50-1000	观察适应度曲线，收敛后即可停止

当算法陷入局部最优时，可以尝试：

增加变异率
引入移民策略（定期加入随机新个体）
使用多种群并行进化

# 自适应变异率示例 def adaptive_mutation_rate(generation, max_generations): base_rate = 0.01 return base_rate * (1 - generation/max_generations)

5. 常见问题与解决方案

5.1 早熟收敛问题

在优化一个无人机路径规划项目时，整个种群在20代后就停止了进化。通过以下改进解决了问题：

多样性检测：定期计算种群相似度

def population_diversity(population): return np.mean([np.std(ind) for ind in population.T])

重启策略：当多样性低于阈值时

if population_diversity(pop) < threshold: pop = initialize_population()

5.2 约束处理技巧

对于有约束条件的问题，我常用这些方法：

惩罚函数法：在适应度中减去违规程度

def constrained_fitness(x): penalty = 0 if x[0] < 0: penalty += 1000 if x[1] > 10: penalty += 1000 return raw_fitness(x) - penalty

修复法：将非法解调整为合法

def repair(x): x[0] = max(0, x[0]) # 确保不小于0 x[1] = min(10, x[1]) # 确保不大于10 return x

6. 进阶应用：多目标优化

当需要同时优化多个目标时，可以采用NSGA-II算法框架。这是我实现的一个简化版：

def fast_non_dominated_sort(population): fronts = [[]] for ind in population: ind.domination_count = 0 ind.dominated_set = [] for other in population: if dominates(ind, other): ind.dominated_set.append(other) elif dominates(other, ind): ind.domination_count += 1 if ind.domination_count == 0: fronts[0].append(ind) i = 0 while fronts[i]: next_front = [] for ind in fronts[i]: for dominated_ind in ind.dominated_set: dominated_ind.domination_count -= 1 if dominated_ind.domination_count == 0: next_front.append(dominated_ind) i += 1 fronts.append(next_front) return fronts[:-1] def dominates(a, b): """判断a是否支配b""" better_in_one = False for f_a, f_b in zip(a.fitness, b.fitness): if f_a < f_b: # 假设都是最大化 return False if f_a > f_b: better_in_one = True return better_in_one

7. 工程实践建议

在实际项目中部署遗传算法时，这些经验可能会帮到你：

日志记录：保存每代的最佳适应度

import csv with open('evolution_log.csv', 'w') as f: writer = csv.writer(f) writer.writerow(['generation', 'best_fitness']) for gen in range(N_GENERATIONS): # ...进化过程... writer.writerow([gen, best_fitness])

可视化监控：实时观察进化过程

import matplotlib.pyplot as plt plt.ion() fig, ax = plt.subplots() x = [] y = [] line, = ax.plot(x, y) for gen in range(N_GENERATIONS): # ...更新数据... x.append(gen) y.append(best_fitness) line.set_data(x, y) ax.relim() ax.autoscale_view() plt.pause(0.1)

提前停止：当改进小于阈值时终止

best_history = [] patience = 5 no_improve = 0 while no_improve < patience: # ...进化过程... if len(best_history) > 1 and (best_history[-1] - best_history[-2]) < 1e-4: no_improve += 1 else: no_improve = 0 best_history.append(current_best)

8. 与其他算法的结合

在最近的一个智能调度系统中，我将遗传算法与局部搜索结合，效果提升了约30%：

def hybrid_optimizer(): # 遗传算法全局搜索 population = initialize_population() for _ in range(N_GENERATIONS//2): population = evolve(population) # 对优秀个体进行局部搜索 elites = select_elites(population, top_n=10) for elite in elites: local_search(elite) # 继续进化 population = replace_worst(population, elites) for _ in range(N_GENERATIONS//2): population = evolve(population)

这种混合策略的优势在于：