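Recently I had to check our company website for banned words, and also check whether any of the anniversary dates on it were wrong. With over a hundred pages, I did not want to go through them one by one, so I took a shortcut and wrote a small tool. It only searches text, not images; experts, feel free to skip this post. The idea is simple: use PHP's cURL functions to fetch a page's source, then search it for the strings I care about. Here is the code: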
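<?php
// Saved as index.php, the URL the front end posts to.
if ($_POST) {
    header("Content-Type: text/html; charset=utf-8");

    // Fetch a page with cURL; returns the body, or null on failure.
    function curl_get($url) {
        $header = array('Accept: text/html'); // the target is an HTML page, not JSON
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_HEADER, 0);
        curl_setopt($curl, CURLOPT_TIMEOUT, 10); // 1 second was too short for slow pages
        curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        // Certificate checks are disabled for convenience; avoid this with sites you do not trust.
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
        $data = curl_exec($curl);
        $error = curl_error($curl);
        curl_close($curl); // previously placed after return, so it never ran
        if ($error) {
            print "Error: " . htmlspecialchars($error);
            return null;
        }
        return $data;
    }

    $url = isset($_POST['url']) ? trim($_POST['url']) : '';
    $chat = isset($_POST['chat']) ? $_POST['chat'] : '';
    $outPageTxt = (string) curl_get($url);

    // Convert GB2312/GBK pages to UTF-8, using the encoding that was actually detected.
    $encode = mb_detect_encoding($outPageTxt, array("utf-8", "GB2312", "GBK"));
    if ($encode !== false && strtolower($encode) != "utf-8") {
        $outPageTxt = mb_convert_encoding($outPageTxt, 'utf-8', $encode);
    }

    // Strip scripts, iframes, styles, and any remaining tags so only the visible text is left.
    $outPageTxt = htmlspecialchars_decode($outPageTxt);
    $outPageTxt = preg_replace("@<script(.*?)</script>@is", "", $outPageTxt);
    $outPageTxt = preg_replace("@<iframe(.*?)</iframe>@is", "", $outPageTxt);
    $outPageTxt = preg_replace("@<style(.*?)</style>@is", "", $outPageTxt);
    $outPageTxt = preg_replace("@<(.*?)>@is", "", $outPageTxt);
    $outPageTxt = htmlspecialchars(trim(strip_tags($outPageTxt)));

    // Highlight every keyword; keywords are separated by "|".
    if ($chat != '') {
        $chararr = explode("|", $chat);
        foreach ($chararr as $v) {
            if ($v === '') {
                continue; // skip empty keywords, e.g. from a trailing "|"
            }
            // The page text is already escaped, so escape the keyword the same way before matching.
            $v = htmlspecialchars($v);
            $outPageTxt = str_replace($v, "<font color='red'>$v</font>", $outPageTxt);
        }
    }
    echo $outPageTxt;
}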
That completes the simple PHP back end; it relies mainly on PHP's cURL functions.
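The strip-and-highlight part of the idea can also be tried on its own, without fetching anything. Below is a minimal sketch under a few assumptions: `extract_text` and `highlight_keywords` are hypothetical helper names that do not appear in the tool itself, the input is already UTF-8, and the `str_replace` loop is swapped for a single `preg_replace` pass with a `<span>` instead of `<font>`.

```php
<?php
// Hypothetical helpers for illustration; not the code the tool actually runs.

// Reduce an HTML document to its visible text.
function extract_text(string $html): string
{
    // Drop whole <script>, <style>, and <iframe> elements, including their contents.
    $html = preg_replace('@<(script|style|iframe)\b.*?</\1>@is', '', $html) ?? $html;
    // Remove the remaining tags, then decode entities such as &amp; into plain characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    return trim($text);
}

// Escape the text for output and wrap every keyword occurrence in a red <span>.
function highlight_keywords(string $text, array $keywords): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    $parts = array();
    foreach ($keywords as $word) {
        if ($word !== '') {
            // Escape the keyword the same way as the text, then quote it for use in a regex.
            $parts[] = preg_quote(htmlspecialchars($word, ENT_QUOTES, 'UTF-8'), '@');
        }
    }
    if (!$parts) {
        return $escaped;
    }
    // A single pass, so markup inserted for one keyword is never re-matched by another.
    $pattern = '@' . implode('|', $parts) . '@u';
    return preg_replace($pattern, '<span style="color:red">$0</span>', $escaped) ?? $escaped;
}

// Sample input: the "2021" inside the script must not be reported.
$html = '<p>2021 plan</p><script>var year = 2021;</script>';
echo highlight_keywords(extract_text($html), explode('|', 'plan|2021'));
```

Doing all replacements in one pass avoids a quirk of the loop version, where a later keyword such as `font` or `red` would match inside the markup added for an earlier one. A keyword like `amp` could still match inside an escaped entity, so this remains a sketch rather than a complete solution.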
Next, let's create the front-end HTML:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport"
content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>大笨熊索引</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css">
<script src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
</head>
<body>
<style>
.logo{ font-size: 50px;text-align: center; margin: 50px auto;}
.input-group{ width: 800px; margin: 0px auto;}
.main{ width: 1200px; height: 600px; overflow-y: auto; margin: 0px auto; line-height: 30px; box-shadow:10px 10px 10px #ccc; padding: 10px;}
.think-page-loader {
top: 0;
left: 0;
right: 0;
bottom: 0;
z-index: 9999999;
position: fixed;
text-align: center;
}
.loader {
top: 50%;
width: 50px;
height: 50px;
margin: -35px 0 0 -35px;
z-index: 999999;
display: inline-block;
position: fixed;
}
.loader:before {
top: 59px;
left: 0;
width: 50px;
height: 7px;
opacity: 0.1;
content: "";
position: absolute;
border-radius: 50%;
background-color: #000;
animation: shadow .5s linear infinite;
}
.loader:after {
top: 0;
left: 0;
width: 50px;
height: 50px;
content: "";
position: absolute;
border-radius: 3px;
background-color: #5FB878;
animation: loading .5s linear infinite;
}
@-webkit-keyframes loading {
17% {
border-bottom-right-radius: 3px;
}
25% {
transform: translateY(9px) rotate(22.5deg);
}
50% {
transform: translateY(18px) scale(1, 0.9) rotate(45deg);
border-bottom-right-radius: 40px;
}
75% {
transform: translateY(9px) rotate(67.5deg);
}
100% {
transform: translateY(0) rotate(90deg);
}
}
@keyframes loading {
17% {
border-bottom-right-radius: 3px;
}
25% {
transform: translateY(9px) rotate(22.5deg);
}
50% {
border-bottom-right-radius: 40px;
transform: translateY(18px) scale(1, 0.9) rotate(45deg);
}
75% {
transform: translateY(9px) rotate(67.5deg);
}
100% {
transform: translateY(0) rotate(90deg);
}
}
@-webkit-keyframes shadow {
0%,
100% {
transform: scale(1, 1);
}
50% {
transform: scale(1.2, 1);
}
}
@keyframes shadow {
0%,
100% {
transform: scale(1, 1);
}
50% {
transform: scale(1.2, 1);
}
}
.zf{width: 800px; height: 50px; margin: 0px auto; margin-bottom: 50px;}
.zf h2{ font-size: 14px; text-align: center;}
.zf textarea{ width:1200px; height: 50px; margin: 0px auto;}
</style>
<div class="logo">大笨熊索引</div>
<div class="input-group input-group-lg">
<input type="text" placeholder="Enter a domain or page URL" name="url" class="form-control" id="mySearchButton">
<span class="input-group-btn">
<button onclick="send()" type="button" class="btn btn-danger">Search</button>
</span>
</div>
<div class="zf">
<label></label><input type="text" placeholder="Keywords to find; separate several with | e.g. 大笨熊|老师" name="chat" class="form-control zf" id="">
</div>
<div class="think-page-loader" style="display: none;"><div class="loader"></div></div>
<div class="main" id="main">
<div id="loading"></div>
</div>
<script>
function send(){
if($('input[name="url"]').val()==''){
alert("Please enter a domain first!");
return false;
}
$.ajax({
url :"index.php",
type:'post',
data:{
url:$('input[name="url"]').val(),
chat:$('input[name="chat"]').val()
},
timeout:15000,
beforeSend:function(XMLHttpRequest){
$(".think-page-loader").css("display","block");
},
success:function(data,textStatus){
$(".main").html(data);
$(".think-page-loader").css("display","none");
},
complete:function(XMLHttpRequest,textStatus){
$(".think-page-loader").css("display","none");
},
error:function(XMLHttpRequest,textStatus,errorThrown){
alert('error... status text: '+textStatus+" exception: "+errorThrown);
$(".think-page-loader").css("display","none");
}
});
}
</script>
</body>
</html>
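The front-end code is much simpler. It uses AJAX to submit the URL and the search strings without reloading the page. As a test, I searched a page on the CSDN site: every occurrence of 开发 ("development") and 2021 was highlighted in red, so I can tell at a glance whether the page has anything I need to change. That's all for this post; if you have a better idea, please leave a comment. Finally, here is the online version.
Online demo: https://www.dbenx.com/curl/get.html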
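With over a hundred pages to review, a per-page count can be quicker to scan than highlighted text. The following is one possible extension, not part of the tool above: `count_keywords` is a hypothetical helper, and the `$pages` array is made-up sample text standing in for pages that have already been fetched and stripped of tags.

```php
<?php
// Hypothetical helper: count how often each keyword occurs in already-extracted page text.
function count_keywords(string $text, array $keywords): array
{
    $counts = array();
    foreach ($keywords as $word) {
        if ($word === '') {
            continue; // an empty needle is not allowed by mb_substr_count
        }
        $counts[$word] = mb_substr_count($text, $word, 'UTF-8');
    }
    return $counts;
}

// Made-up sample data; in practice this would be the text of each fetched page.
$pages = array(
    'page-a' => 'Founded in 2011. Celebrating 10 years in 2021.',
    'page-b' => 'Contact us for a quote.',
);

foreach ($pages as $name => $text) {
    // array_filter drops keywords with a count of zero.
    $hits = array_filter(count_keywords($text, array('2021', '10 years')));
    echo $name . ': ' . ($hits ? json_encode($hits) : 'no matches') . "\n";
}
```

Only the pages that report matches then need a closer look in the highlighted view.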