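Overview
OCR (Optical Character Recognition) is one of the earliest computer vision tasks. An electronic device (an image capture device) captures printed characters in a real-world scene; the shapes of the characters are then detected, and a character recognition method translates them into machine-readable text.
Related link: "Intel Innovation Master Cup" Deep Learning Challenge (“英特尔创新大师杯”深度学习挑战赛)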
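To make the idea of "recognizing characters from their shapes" concrete, here is a toy sketch in plain Python. It is not how PaddleOCR works: the 5x3 bitmap templates, the three-character alphabet, and all function names are assumptions made up for illustration. It splits a binary image into glyphs at blank columns, then picks the template with the fewest differing pixels.

```python
# Toy template-matching OCR on binary bitmaps (illustrative assumption only).
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}

def render(text):
    """Draw `text` as 5 rows of '0'/'1', with one blank column between glyphs."""
    return ["0".join(TEMPLATES[ch][r] for ch in text) for r in range(5)]

def segment(image):
    """Split the image into glyphs at columns that contain no ink."""
    width = len(image[0])
    ink = [any(row[c] == "1" for row in image) for c in range(width)]
    glyphs, start = [], None
    for c, has_ink in enumerate(ink + [False]):  # sentinel closes the last glyph
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            glyphs.append([row[start:c] for row in image])
            start = None
    return glyphs

def hamming(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(glyph):
    """Return the character whose template is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

def read_text(image):
    return "".join(recognize(g) for g in segment(image))

print(read_text(render("107")))
```

Modern OCR systems replace both hand-written steps with learned detection and recognition models, which is what the baseline in this post trains.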
Environment setup
python -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
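After installing, it can help to confirm that the package is importable before starting a long training run. The following is a minimal sketch; the helper name `check_paddle` and the shape of the returned dictionary are assumptions for illustration, not part of Paddle.

```python
import importlib.util

def check_paddle():
    """Report whether `paddle` is importable and, if so, its version and CUDA build flag."""
    info = {"installed": False, "version": None, "cuda": None}
    if importlib.util.find_spec("paddle") is None:
        return info  # package not found in this environment
    try:
        import paddle
        info["installed"] = True
        info["version"] = paddle.__version__
        # True when this build of Paddle was compiled with CUDA support.
        info["cuda"] = paddle.is_compiled_with_cuda()
    except Exception as exc:  # e.g. missing GPU runtime libraries at import time
        info["error"] = repr(exc)
    return info

print(check_paddle())
```

Paddle also ships its own installation check, `paddle.utils.run_check()`, which runs a small computation on the available device.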
baseline
The main code is as follows:
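# Wrap the model for multi-GPU data-parallel training.
model = paddle.DataParallel(model)
# Hand everything over to the shared training routine.
program.train(config, train_dataloader, valid_dataloader, device, model,
              loss_class, optimizer, lr_scheduler, post_process_class,
              eval_class, pre_best_model_dict, logger, vdl_writer)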
def train(config,
          train_dataloader,
          valid_dataloader,
          device,
          model,
          loss_class,
          optimizer,
          lr_scheduler,
          post_process_class,
          eval_class,
          pre_best_model_dict,
          logger,
          vdl_writer=None):
    # ......
    model.train()  # switch the model to training mode
    # ......
    for epoch in range(start_epoch, epoch_num + 1):
        # Rebuild the training dataloader each epoch, seeded by the epoch number.
        train_dataloader = build_dataloader(
            config, 'Train', device, logger, seed=epoch)
        for idx, batch in enumerate(train_dataloader):
            images = batch[0]
            preds = model(images)            # forward pass
            loss = loss_class(preds, batch)
            avg_loss = loss['loss']
            avg_loss.backward()              # backpropagate
            optimizer.step()                 # update parameters
            optimizer.clear_grad()           # reset gradients for the next batch
        # ......
        # Save a checkpoint named after the current epoch.
        save_model(
            model,
            optimizer,
            save_model_dir,
            logger,
            is_best=False,
            prefix='iter_epoch_{}'.format(epoch),
            best_model_dict=best_model_dict,
            epoch=epoch,
            global_step=global_step)
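The loop above follows a common pattern: forward pass, loss, `backward()`, `step()`, `clear_grad()`, with the data reshuffled per epoch and a checkpoint name derived from the epoch number. The sketch below reproduces that pattern in plain Python so it runs without Paddle. Everything in it is an assumption for illustration: `LinearModel`, `SGD`, the toy `build_dataloader`, and the synthetic data are stand-ins, not the PaddleOCR implementations.

```python
import random

class LinearModel:
    """y = w * x + b with manually accumulated gradients (stand-in for a network)."""
    def __init__(self):
        self.w, self.b = 0.0, 0.0
        self.grad_w, self.grad_b = 0.0, 0.0

    def __call__(self, xs):
        return [self.w * x + self.b for x in xs]

    def backward(self, xs, ys, preds):
        # Gradient of the mean squared error; gradients accumulate until cleared.
        n = len(xs)
        for x, y, p in zip(xs, ys, preds):
            self.grad_w += 2.0 * (p - y) * x / n
            self.grad_b += 2.0 * (p - y) / n

class SGD:
    """Plain gradient descent exposing step() and clear_grad()."""
    def __init__(self, model, lr):
        self.model, self.lr = model, lr

    def step(self):
        self.model.w -= self.lr * self.model.grad_w
        self.model.b -= self.lr * self.model.grad_b

    def clear_grad(self):
        self.model.grad_w = self.model.grad_b = 0.0

def build_dataloader(data, batch_size, seed):
    # Shuffle with an epoch-dependent seed, then cut into batches.
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]

def train(data, start_epoch=1, epoch_num=200, lr=0.05, batch_size=4):
    model = LinearModel()
    optimizer = SGD(model, lr)
    checkpoints = []  # checkpoint names only; nothing is written to disk
    for epoch in range(start_epoch, epoch_num + 1):
        for batch in build_dataloader(data, batch_size, seed=epoch):
            xs = [x for x, _ in batch]
            ys = [y for _, y in batch]
            preds = model(xs)                # forward pass
            model.backward(xs, ys, preds)    # compute gradients
            optimizer.step()                 # update parameters
            optimizer.clear_grad()           # reset gradients
        checkpoints.append('iter_epoch_{}'.format(epoch))
    return model, checkpoints

# Synthetic, noise-free data drawn from y = 3x + 1.
data = [(x / 10.0, 3.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
model, checkpoints = train(data)
print(model.w, model.b, checkpoints[-1])
```

Omitting `clear_grad()` would let gradients from earlier batches add up, which is why the call follows every `step()` in the original loop as well.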
Results
TBD
Further reading
https://gitee.com/coggle/tianchi-intel-PaddleOCR