背景
公司的管理后台的列表导出 excel 功能,一次性导出上万条数据就会崩溃报错,而列表查询一天的数据行可能就有十几万条,无法满足需求,所以需要优化。
报错分析
运行环境:LNMP(ThinkPHP5框架)
代码:
public function exportExcel()
{
$data = $this->index(true);
$rows = [];
foreach ($data as $val) {
}
$excel = new Excel();
$excel->setColumnMean([])->setData($rows)->download('用户列表-'.date('YmdHis'));
}
1.直接导出接近 100 万条数据
超过了设置的内存限制值
2.如上代码,打开调试步骤3,看是否在查询数据库这一步报错的
报错结果同上,说明在查询数据库这一步超过了设置的内存限制值
3.代码恢复,打开调试步骤2,不限内存,再导出
报 404,应该是执行超时了
4.代码恢复,打开调试步骤2 和 3,再导出
还是报 404,说明在查询数据库这一步已经执行超时了
5.代码恢复,打开调试步骤1 和 2,不限执行时间不限内存,再导出
报 404
6.代码恢复,打开调试步骤1 和 2 和 3,再导出
报 404
问题总结:
mysql 一次性查询大批量数据,加载到 php 内存,导致内存溢出并且执行容易超时。
解决办法
关键词:非缓冲查询
public function exportExcel()
{
$head = [];
$sql = $this->index(true);
$this->putCsv($sql, '用户列表'.date('YmdHis'), $head);
}
private function putCsv($sql, $mark, $head)
{
set_time_limit(0);
try {
$filePath = ROOT_PATH . DS . 'runtime' . DS . 'temp' . DS . $mark . '.csv';
$fp = fopen($filePath, 'w');
fwrite($fp, chr(0xEF).chr(0xBB).chr(0xBF));
fputcsv($fp, $head);
$dbConfig = config('db_data');
$dsn = "mysql:dbname={$dbConfig['database']};host={$dbConfig['hostname']};port={$dbConfig['hostport']}";
$pdo = new \PDO($dsn, $dbConfig['username'], $dbConfig['password']);
$pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$rows = $pdo->query($sql);
if ($rows) {
while ($v = $rows->fetch(\PDO::FETCH_ASSOC)) {
fputcsv($fp, $v);
}
}
fclose($fp);
header("Cache-Control: max-age=0");
header("Content-Description: File Transfer");
header('Content-Disposition: attachment; filename=' . basename($filePath));
header("Content-Type: text/csv;charset=utf8");
header("Content-Transfer-Encoding: binary");
header('Content-Length: ' . filesize($filePath));
@readfile($filePath);
unlink($filePath);
} catch (\Exception $e) {
Log::error('用户列表导出报错:'.$e->getMessage());
if (isset($filePath) && file_exists($filePath)) unlink($filePath);
$this->error('导出失败,请稍后再试');
}
}
原本代码里 sql 查询是用的 tp5 封装的 db 库,db 库其实也是对 PDO 的再封装(看下面 tp5 的代码)。db 库默认采用的缓冲查询,即每次都将查询结果一次过加载到 php 内存中。而非缓冲查询则是逐一加载到 php 内存中,适合查询大批量数据的场景。具体参考官方文档:Buffered and Unbuffered queries
thinkphp/library/think/db/Query.php
public function select($data = null)
{
if ($data instanceof Query) {
return $data->select();
} elseif ($data instanceof \Closure) {
call_user_func_array($data, [ & $this]);
$data = null;
}
$options = $this->parseExpress();
if (false === $data) {
$options['fetch_sql'] = true;
} elseif (!is_null($data)) {
$this->parsePkWhere($data, $options);
}
$resultSet = false;
if (empty($options['fetch_sql']) && !empty($options['cache'])) {
$cache = $options['cache'];
unset($options['cache']);
$key = is_string($cache['key']) ? $cache['key'] : md5($this->connection->getConfig('database') . '.' . serialize($options) . serialize($this->bind));
$resultSet = Cache::get($key);
}
if (false === $resultSet) {
$sql = $this->builder->select($options);
$bind = $this->getBind();
if ($options['fetch_sql']) {
return $this->connection->getRealSql($sql, $bind);
}
$options['data'] = $data;
if ($resultSet = $this->trigger('before_select', $options)) {
} else {
$resultSet = $this->query($sql, $bind, $options['master'], $options['fetch_pdo']);
if ($resultSet instanceof \PDOStatement) {
return $resultSet;
}
}
if (isset($cache) && false !== $resultSet) {
$this->cacheData($key, $resultSet, $cache);
}
}
if (!empty($this->model)) {
if (count($resultSet) > 0) {
foreach ($resultSet as $key => $result) {
$model = $this->model->newInstance($result);
$model->isUpdate(true);
if (!empty($options['relation'])) {
$model->relationQuery($options['relation']);
}
if (!empty($options['with_count'])) {
$model->relationCount($model, $options['with_count']);
}
$resultSet[$key] = $model;
}
if (!empty($options['with'])) {
$model->eagerlyResultSet($resultSet, $options['with']);
}
$resultSet = $model->toCollection($resultSet);
} else {
$resultSet = $this->model->toCollection($resultSet);
}
} elseif ('collection' == $this->connection->getConfig('resultset_type')) {
$resultSet = new Collection($resultSet);
}
if (!empty($options['fail']) && count($resultSet) == 0) {
$this->throwNotFound($options);
}
return $resultSet;
}
上面的 $this->query(); 最终跳到 thinkphp/library/think/db/Connection.php
public function query($sql, $bind = [], $master = false, $pdo = false)
{
$this->initConnect($master);
if (!$this->linkID) {
return false;
}
$this->queryStr = $sql;
if ($bind) {
$this->bind = $bind;
}
Db::$queryTimes++;
try {
$this->debug(true);
$this->PDOStatement = $this->linkID->prepare($sql);
$procedure = in_array(strtolower(substr(trim($sql), 0, 4)), ['call', 'exec']);
if ($procedure) {
$this->bindParam($bind);
} else {
$this->bindValue($bind);
}
$this->PDOStatement->execute();
$this->debug(false, '', $master);
return $this->getResult($pdo, $procedure);
} catch (\PDOException $e) {
if ($this->isBreak($e)) {
return $this->close()->query($sql, $bind, $master, $pdo);
}
throw new PDOException($e, $this->config, $this->getLastsql());
} catch (\Throwable $e) {
if ($this->isBreak($e)) {
return $this->close()->query($sql, $bind, $master, $pdo);
}
throw $e;
} catch (\Exception $e) {
if ($this->isBreak($e)) {
return $this->close()->query($sql, $bind, $master, $pdo);
}
throw $e;
}
}
另外,改用了性能更好的 csv,亲测 csv 比 PhpSpreadsheet 更能应付大批量数据导出。上面的方法能导出几十万乃至百万条数据,但是要注意对耗时的敏感程度。
|