Step-by-Step Tutorial: Converting COCO-Format JSON Labels into YOLOv5-Ready txt Files with Python - 创锋一号

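From COCO to YOLOv5: A Beginner's Hands-On Guide to Converting JSON Labels

When you first get into object detection, the biggest headache is often not training the model but the format conversions required during data preparation. Last week, while helping a new student in our lab process the COCO dataset, I found that most online tutorials are either too terse or ship code with hidden bugs. This post takes the most direct route, from dissecting the data structures to a complete Python implementation, to settle a conversion that looks simple but hides several traps.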

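1. Understanding How the Two Formats Differ

1.1 Anatomy of the COCO JSON

Open any COCO-format JSON file and you will see five core top-level fields, each holding nested objects or lists (the # comments are annotations for the reader; real JSON has no comments):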

{
  "info": {},            # dataset metadata
  "licenses": [],        # license notices
  "images": [            # list of image files
    {"id": 0, "width": 640, "height": 480, "file_name": "000001.jpg"}
  ],
  "annotations": [       # annotation records
    {"id": 1, "image_id": 0, "category_id": 2, "bbox": [x, y, width, height]}
  ],
  "categories": [        # class definitions
    {"id": 1, "name": "person"}
  ]
}

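The key point is that bbox in annotations uses the top-left corner plus width and height (XYWH) in absolute pixels, whereas YOLO uses the box center plus width and height, normalized by the image size. For example:

COCO format: [100, 150, 200, 300] means the top-left corner is at (100, 150), with a width of 200 pixels and a height of 300 pixels.
YOLO format: 0 0.3125 0.625 0.3125 0.625 means class 0, centered at (0.3125, 0.625), with width 0.3125 and height 0.625 (normalized values). This is the same box on a 640×480 image.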

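1.2 Ground Rules for YOLO TXT Labels

YOLOv5 expects plain-text label files, with one line per annotated object. The core properties:

Normalized values: x and width are divided by the image width, y and height by the image height, giving floats between 0 and 1
Center-point representation: <class> <x_center> <y_center> <width> <height>
File pairing: image.jpg pairs with image.txt; the two share a base name, and the label sits under labels/ at the same relative path the image has under images/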

A typical directory layout:

dataset/
├── images/
│   └── train/
│       ├── 000001.jpg
│       └── 000002.jpg
└── labels/
    └── train/
        ├── 000001.txt
        └── 000002.txt
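The pairing rule can be sketched as a small path helper. This is a minimal sketch, assuming a layout with parallel images/ and labels/ directories; `image_to_label_path` is a hypothetical name, not a YOLOv5 function. To my knowledge YOLOv5's own loader derives label paths in the same spirit, by swapping the last images directory for labels and the extension for .txt.

```python
from pathlib import Path


def image_to_label_path(img_path):
    """Map .../images/<split>/<name>.jpg to .../labels/<split>/<name>.txt."""
    parts = list(Path(img_path).parts)
    # Walk the directory components from right to left (skipping the file
    # name itself) and swap the last "images" directory for "labels".
    for i in range(len(parts) - 2, -1, -1):
        if parts[i] == "images":
            parts[i] = "labels"
            break
    else:
        raise ValueError(f"no 'images' directory in path: {img_path}")
    return Path(*parts).with_suffix(".txt")


print(image_to_label_path("dataset/images/train/000001.jpg"))
```

If the helper raises ValueError for your images, the dataset is probably not laid out the way the tree above assumes.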

2. The Four Core Modules of the Conversion

2.1 Pre-Processing Checklist

Before writing any code, confirm the following:

Path validation:

import os

json_path = "coco/annotations/instances_train2017.json"
assert os.path.exists(json_path), f"JSON file not found: {json_path}"

Spot-check the data:

import json

with open(json_path) as f:
    data = json.load(f)

print(f"Total images: {len(data['images'])}")
print(f"Total annotations: {len(data['annotations'])}")
print(f"Sample annotation: {data['annotations'][0]}")

Required-field checklist:

Field	Required	Example	Notes
images[].id	✓	397133	Unique identifier
images[].file_name	✓	"000000397133.jpg"	Includes the extension
annotations[].image_id	✓	397133	Matches an image ID
annotations[].bbox	✓	[472.81, 125.66, 103.29, 85.67]	XYWH format
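The checklist can also be run as code rather than by eye. The following is a minimal sketch, assuming the JSON has already been loaded into a dict; `find_schema_problems` and the two key tuples are hypothetical names. It additionally checks width, height and category_id, which the table does not list but which the conversion later in this post reads.

```python
REQUIRED_IMAGE_KEYS = ("id", "file_name", "width", "height")
REQUIRED_ANN_KEYS = ("image_id", "category_id", "bbox")


def find_schema_problems(data):
    """Return human-readable problems found in a loaded COCO dict."""
    problems = []
    image_ids = set()
    for img in data.get("images", []):
        if "id" in img:
            image_ids.add(img["id"])
        missing = [k for k in REQUIRED_IMAGE_KEYS if k not in img]
        if missing:
            problems.append(f"image {img.get('id')!r}: missing {missing}")
    for ann in data.get("annotations", []):
        missing = [k for k in REQUIRED_ANN_KEYS if k not in ann]
        if missing:
            problems.append(f"annotation {ann.get('id')!r}: missing {missing}")
        elif ann["image_id"] not in image_ids:
            # The annotation refers to an image that is not in the file
            problems.append(
                f"annotation {ann.get('id')!r}: unknown image_id {ann['image_id']!r}"
            )
        elif len(ann["bbox"]) != 4:
            problems.append(f"annotation {ann.get('id')!r}: bbox must have 4 values")
    return problems
```

An empty list means every record has the fields the converter needs; anything else is worth fixing before conversion rather than debugging after it.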

2.2 The Math Behind the Coordinate Conversion

The conversion boils down to two arithmetic steps:

COCO top-left corner → YOLO center point:

x_center = x + width/2
y_center = y + height/2

Normalization:

x_center /= image_width
y_center /= image_height
width /= image_width
height /= image_height

In Python this becomes:

def coco_to_yolo(bbox, img_w, img_h):
    x, y, w, h = bbox
    x_center = (x + w/2) / img_w
    y_center = (y + h/2) / img_h
    w_norm = w / img_w
    h_norm = h / img_h
    return [x_center, y_center, w_norm, h_norm]
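As a quick worked example, here is the same function applied to the box [100, 150, 200, 300] from section 1.1. The 640×480 image size is an assumption taken from the sample images entry shown there.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x, y, w, h] pixel box to normalized YOLO values."""
    x, y, w, h = bbox
    x_center = (x + w / 2) / img_w   # (100 + 100) / 640
    y_center = (y + h / 2) / img_h   # (150 + 150) / 480
    w_norm = w / img_w               # 200 / 640
    h_norm = h / img_h               # 300 / 480
    return [x_center, y_center, w_norm, h_norm]


box = coco_to_yolo([100, 150, 200, 300], img_w=640, img_h=480)
print(" ".join(f"{v:.6f}" for v in box))
# 0.312500 0.625000 0.312500 0.625000
```

Note that x and y are normalized by different denominators, so a box that is taller than it is wide in pixels can still end up with a larger normalized height only if the image proportions allow it; always divide by the matching dimension.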

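2.3 An Efficient Batch-Processing Design

Rescanning the full annotations list for every image costs O(N) per image, which is slow on a large dataset. Grouping the annotations into a dictionary once up front makes each per-image lookup O(1) on average: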

from collections import defaultdict

# Map each image ID to its annotations
image_annots = defaultdict(list)
for ann in data['annotations']:
    image_annots[ann['image_id']].append(ann)

# Map each image ID to its image record
image_info = {img['id']: img for img in data['images']}
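To see what the two lookup tables contain, here is a small self-contained sketch on toy data. The `build_indexes` wrapper and the toy records are hypothetical; the grouping logic is the same as above. It also shows one thing the tables make cheap to find: images that have no annotations at all.

```python
from collections import defaultdict


def build_indexes(data):
    """Group annotations by image ID and index image records by ID."""
    image_annots = defaultdict(list)
    for ann in data["annotations"]:
        image_annots[ann["image_id"]].append(ann)
    image_info = {img["id"]: img for img in data["images"]}
    return image_annots, image_info


toy = {
    "images": [{"id": 0}, {"id": 1}, {"id": 2}],
    "annotations": [
        {"id": 10, "image_id": 0},
        {"id": 11, "image_id": 0},
        {"id": 12, "image_id": 2},
    ],
}
image_annots, image_info = build_indexes(toy)

# A membership test does not create a key in a defaultdict, so this is safe
unlabeled = [img_id for img_id in image_info if img_id not in image_annots]
print(len(image_annots[0]), unlabeled)
```

Whether unlabeled images should get an empty txt file or no file at all depends on how you want the training pipeline to treat background images.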

2.4 The Complete Conversion Script

import json
import os
from collections import defaultdict
from pathlib import Path


def convert_coco_to_yolo(json_path, output_dir):
    with open(json_path) as f:
        data = json.load(f)

    # Map each category ID to a contiguous index
    categories = {cat['id']: idx for idx, cat in enumerate(data['categories'])}

    # Pre-build the lookup structures
    image_annots = defaultdict(list)
    for ann in data['annotations']:
        image_annots[ann['image_id']].append(ann)
    image_info = {img['id']: img for img in data['images']}

    # Make sure the output directory exists
    os.makedirs(output_dir, exist_ok=True)

    for img_id, annots in image_annots.items():
        img = image_info[img_id]
        img_w, img_h = img['width'], img['height']

        # Derive the matching txt file name
        txt_name = Path(img['file_name']).stem + '.txt'
        txt_path = os.path.join(output_dir, txt_name)

        with open(txt_path, 'w') as f:
            for ann in annots:
                # Coordinate conversion
                x, y, w, h = ann['bbox']
                x_center = (x + w/2) / img_w
                y_center = (y + h/2) / img_h
                w_norm = w / img_w
                h_norm = h / img_h

                # Look up the class index
                class_idx = categories[ann['category_id']]

                # Write the YOLO-format line
                line = f"{class_idx} {x_center:.6f} {y_center:.6f} {w_norm:.6f} {h_norm:.6f}\n"
                f.write(line)


# Example usage
convert_coco_to_yolo(
    json_path="coco/annotations/instances_train2017.json",
    output_dir="coco/labels/train2017"
)
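Once class indices have been remapped, the class names have to be listed in that same index order wherever the training setup asks for them (to my knowledge, the names entry of a YOLOv5 dataset YAML). A minimal sketch of keeping the two in sync follows; `build_class_tables` is a hypothetical helper and the toy category names are made up for illustration.

```python
def build_class_tables(categories):
    """Return (category ID -> contiguous index, names listed in index order)."""
    id_to_index = {cat["id"]: idx for idx, cat in enumerate(categories)}
    names = [cat["name"] for cat in categories]
    return id_to_index, names


# Toy categories with non-contiguous IDs (names are illustrative only)
toy_categories = [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "bicycle"},
    {"id": 4, "name": "motorcycle"},
    {"id": 7, "name": "train"},
]
id_to_index, names = build_class_tables(toy_categories)
print(id_to_index)
print(names)
```

Because both tables come from one pass over the same list, index i in the label files always refers to names[i].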

3. Pitfalls and Performance Tuning

3.1 Top 5 Beginner Mistakes

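Path traps:
- Wrong: writing a Windows path as a plain string literal such as "C:\Users\name\data" (Python parses backslash sequences like \U and \n as escapes)
- Right: use a raw string r"C:\Users\name\data" or a Path object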

Missing normalization:

# Wrong: normalization forgotten
x_center = (x + w/2)
# Right:
x_center = (x + w/2) / img_w

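Non-contiguous category IDs:
- The original COCO IDs may have gaps (for example 1, 2, 4, 7)
- Rebuild them as contiguous indices (0, 1, 2, 3)
Bad image dimensions:
- Some datasets may contain images recorded as 0x0
- Add a filter:
```
if img_w == 0 or img_h == 0:
    continue
```
Floating-point formatting:
- A plain str() conversion can produce scientific notation for very small values
- Use f"{value:.6f}" to fix the number of decimal places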
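Several of the points above can be folded into one defensive helper. This is a minimal sketch under stated assumptions: `to_yolo_line` is a hypothetical name, and clipping boxes to the image bounds is my addition rather than something the list above prescribes (it keeps normalized values inside 0 to 1 when an annotation spills slightly past the image edge).

```python
def to_yolo_line(class_idx, bbox, img_w, img_h):
    """Return one YOLO label line, or None if the box cannot be used."""
    if img_w <= 0 or img_h <= 0:
        return None  # zero-size image: nothing to normalize by
    x, y, w, h = bbox
    # Assumption: clip the box to the image so values stay within [0, 1]
    x1, y1 = max(0.0, x), max(0.0, y)
    x2, y2 = min(float(img_w), x + w), min(float(img_h), y + h)
    if x2 <= x1 or y2 <= y1:
        return None  # degenerate box, or box entirely outside the image
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    bw = (x2 - x1) / img_w
    bh = (y2 - y1) / img_h
    # Fixed-point formatting avoids scientific notation for tiny values
    return f"{class_idx} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}"


print(to_yolo_line(0, [100, 150, 200, 300], 640, 480))
# 0 0.312500 0.625000 0.312500 0.625000
```

Returning None instead of raising lets the caller count how many annotations were skipped, which matters if you later compare annotation counts before and after conversion.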

3.2 Advanced Optimizations

Memory-friendly version (for very large datasets):

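import os

import ijson


def stream_convert(json_path, output_dir):
    os.makedirs(output_dir, exist_ok=True)
    with open(json_path, 'rb') as f:
        # Stream the JSON instead of loading it all at once.
        # 'images' and 'annotations' are arrays, so their elements are
        # addressed with the '<key>.item' prefix.
        image_info = {img['id']: img for img in ijson.items(f, 'images.item')}
        f.seek(0)  # rewind before the second pass over the file
        # Note: ijson yields decimal.Decimal for non-integer numbers
        annots = ijson.items(f, 'annotations.item')
        # The rest of the processing logic is similar...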

Multi-process speedup:

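from multiprocessing import Pool


def process_image(args):
    img_id, annots, image_info, categories = args
    # Conversion logic...


# chunked_data: one (img_id, annots, image_info, categories) tuple per task
if __name__ == "__main__":
    with Pool(processes=4) as pool:
        pool.map(process_image, chunked_data)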
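One way to flesh out the worker is sketched below. This is an illustration under assumptions, not the only design: `write_label_file`, `convert_parallel` and the task tuple layout are hypothetical. Each task carries only what one image needs (output path, image size, boxes with already-remapped class indices), so the full image_info and categories tables do not have to be pickled and sent to every worker.

```python
from multiprocessing import Pool


def write_label_file(task):
    """Worker: convert one image's boxes and write its txt file.

    task is (txt_path, img_w, img_h, boxes), where boxes is a list of
    (class_idx, [x, y, w, h]) pairs in COCO pixel coordinates.
    """
    txt_path, img_w, img_h, boxes = task
    lines = []
    for class_idx, (x, y, w, h) in boxes:
        xc = (x + w / 2) / img_w
        yc = (y + h / 2) / img_h
        lines.append(f"{class_idx} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}\n")
    with open(txt_path, "w") as f:
        f.writelines(lines)
    return len(lines)


def convert_parallel(tasks, processes=4):
    """Run write_label_file over all tasks; return the total lines written."""
    with Pool(processes=processes) as pool:
        # chunksize batches tasks to cut per-task communication overhead
        return sum(pool.imap_unordered(write_label_file, tasks, chunksize=64))
```

Call convert_parallel from inside an `if __name__ == "__main__":` block, since platforms that start workers by spawning re-import the main module. For a dataset where conversion is dominated by disk writes rather than arithmetic, the speedup may be modest, so it is worth timing against the single-process script first.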

4. Verifying the Conversion

4.1 Visual Inspection Tool

Draw the boxes with OpenCV to check them:

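import cv2


def visualize_yolo_label(img_path, label_path):
    img = cv2.imread(img_path)
    if img is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(f"Cannot read image: {img_path}")
    h, w = img.shape[:2]

    with open(label_path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            class_id, xc, yc, bw, bh = map(float, line.split())
            # Convert back to pixel coordinates
            x1 = int((xc - bw/2) * w)
            y1 = int((yc - bh/2) * h)
            x2 = int((xc + bw/2) * w)
            y2 = int((yc + bh/2) * h)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

    cv2.imshow('Validation', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()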
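The same back-conversion can be checked without a display. Below is a minimal sketch of the inverse mapping in plain Python; `yolo_line_to_coco` is a hypothetical helper, and the 640×480 size is the one assumed in the earlier example. Converting a label line back should reproduce the original COCO box up to rounding.

```python
def yolo_line_to_coco(line, img_w, img_h):
    """Parse one YOLO label line back into (class_idx, [x, y, w, h]) pixels."""
    fields = line.split()
    class_idx = int(fields[0])
    xc, yc, bw, bh = (float(v) for v in fields[1:5])
    w = bw * img_w
    h = bh * img_h
    x = (xc - bw / 2) * img_w   # center minus half the width gives the left edge
    y = (yc - bh / 2) * img_h   # center minus half the height gives the top edge
    return class_idx, [x, y, w, h]


print(yolo_line_to_coco("0 0.312500 0.625000 0.312500 0.625000", 640, 480))
# (0, [100.0, 150.0, 200.0, 300.0])
```

With six decimal places in the label file, the recovered pixel values can differ from the source by a small fraction of a pixel, so compare with a tolerance rather than exact equality on real data.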

4.2 Automated Validation Script

import json
from pathlib import Path


def validate_conversion(original_json, yolo_labels_dir):
    # Count the annotations in the original JSON
    with open(original_json) as f:
        data = json.load(f)
    original_count = len(data['annotations'])

    # Count the annotation lines in the converted TXT files
    converted_count = 0
    for txt_file in Path(yolo_labels_dir).glob('*.txt'):
        with open(txt_file) as f:
            converted_count += sum(1 for _ in f)

    assert original_count == converted_count, \
        f"Annotation count mismatch: original {original_count} != converted {converted_count}"
    print(f"Validation passed! All {original_count} annotations were converted")
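Matching counts show that no annotation was lost, but not that each line is well formed. A complementary per-line check might look like the sketch below; `find_label_problems` is a hypothetical helper, and the rules it applies (five fields, integer class within range, coordinates within 0 to 1, positive width and height) follow the format described in section 1.2.

```python
def find_label_problems(text, num_classes):
    """Return a list of problems found in the text of one YOLO label file."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 5:
            problems.append(f"line {lineno}: expected 5 fields, got {len(fields)}")
            continue
        try:
            class_idx = int(fields[0])
            coords = [float(v) for v in fields[1:]]
        except ValueError:
            problems.append(f"line {lineno}: non-numeric value")
            continue
        if not 0 <= class_idx < num_classes:
            problems.append(f"line {lineno}: class {class_idx} out of range")
        if any(not 0.0 <= v <= 1.0 for v in coords):
            problems.append(f"line {lineno}: coordinate outside [0, 1]")
        elif coords[2] <= 0 or coords[3] <= 0:
            problems.append(f"line {lineno}: non-positive width or height")
    return problems
```

To apply it to a directory, read each txt file with Path.read_text() and pass the contents along with the number of classes in your dataset.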
