phpAntiXSS beta 1.0: a general-purpose anti-XSS function for rich text editors

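Rich text editors have always been the biggest headache in XSS defense. This function is built on a tag whitelist. I wrote it in a day, and many of the regexes for attribute values were thrown together without much thought, so I look forward to seeing the experts bypass it...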
Download the phpAntiXSS beta 1.0 source code

[php]function antixss($html){
    /*
     * Tag whitelist
     * @allow_tag array
     */
    $allow_tag = array(
        "a", "span", "img", "p", "br",
        "div", "strong", "b", "ul", "li", "ol", "u", "em"
    );
    /*
     * Attributes allowed on each tag, and the regex each value must match.
     * The '*' entry applies to every whitelisted tag.
     * Note that '/.*/i' accepts any value, javascript: URLs included.
     * @allow_tag_attr array
     */
    $allow_tag_attr = array(
        '*' => array(
            'id' => '/^[\w_-]+$/i',
            'class' => '/^[\w_-]+$/i',
            'name' => '/^[\w_-]+$/i',
            // the hyphen goes last in the class so that it is read literally
            'style' => '/^[\w:=\s"\'-]+$/i',
            'value' => '/.*/i',
            'alt' => '/.*/i',
            'width' => '/^[\w_-]+$/i',
            'height' => '/^[\w_-]+$/i',
        ),
        'a' => array(
            'href' => '/.*/i',
            'title' => '/.*/i',
            'target' => '/^[\w_-]+$/i',
        ),
        'img' => array(
            'src' => '/.*/i',
        ),
    );
    // Rewrite every string enclosed in angle brackets. A callback is used so that
    // each tag is replaced exactly once, at the position where it was found.
    return preg_replace_callback('/<[^>]*>/s', function($m) use ($allow_tag, $allow_tag_attr){
        $tag = $m[0];

        // Extract the slash (if any) and the tag name, e.g. a, br, html, li, script
        if(!preg_match('/^<\s*(\/*)\s*(\w+)/', $tag, $tag_name)){
            // Nothing that looks like a tag name: comments, "<>", and so on
            return "<removed source='' />";
        }
        $name = strtolower($tag_name[2]);
        if(!in_array($name, $allow_tag, true)){
            return "<removed source='" . $name . "' />";
        }
        // Closing tags are rebuilt without attributes
        if($tag_name[1] !== ''){
            return '</' . $name . '>';
        }

        // Extract attributes written with an equals sign. Note: the current version
        // does not support attributes without one, such as readonly; they are dropped.
        // Group 1 is the name; group 2 is the value, whether quoted or not.
        preg_match_all('/(?<![\w-])([a-z]+)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))/i', $tag, $attrs, PREG_SET_ORDER);

        $kept = array();
        foreach($attrs as $a){
            $attr = strtolower($a[1]);
            $value = $a[2];

            // An attribute is allowed if its value matches the pattern for this tag
            // or the pattern under '*'; otherwise it is dropped.
            $patterns = array();
            if(isset($allow_tag_attr[$name][$attr])) $patterns[] = $allow_tag_attr[$name][$attr];
            if(isset($allow_tag_attr['*'][$attr])) $patterns[] = $allow_tag_attr['*'][$attr];
            foreach($patterns as $pattern){
                if(preg_match($pattern, $value)){
                    // The value is re-emitted in single quotes, so encode any it contains
                    $kept[] = $attr . "='" . str_replace("'", '&#39;', $value) . "'";
                    break;
                }
            }
        }
        return '<' . $name . ($kept ? ' ' . implode(' ', $kept) : '') . '>';
    }, $html);
}[/php]
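
One obvious place to start on a bypass is the '/.*/i' pattern on href and src: it accepts any value, javascript: URLs included. Below is a minimal sketch of a stricter check, assuming only absolute http(s) links, mailto links, root-relative paths, and fragments need to pass. is_safe_url() is a hypothetical helper of mine, not part of phpAntiXSS.

[php]// Hypothetical helper, not part of phpAntiXSS.
// Accepts a value only if it literally begins with one of the allowed prefixes,
// so entity-encoded or whitespace-padded schemes never reach the browser.
function is_safe_url($value){
    return preg_match('/^(?:https?:\/\/|mailto:|\/|#)/i', $value) === 1;
}

var_dump(is_safe_url('http://example.com/'));     // allowed prefix
var_dump(is_safe_url('/index.php?id=1'));         // root-relative path
var_dump(is_safe_url('javascript:alert(1)'));     // scheme not on the list
var_dump(is_safe_url('&#106;avascript:alert(1)')); // entity-encoded scheme[/php]

The same pattern could probably be dropped into $allow_tag_attr in place of '/.*/i' for href and src. It is deliberately conservative: plain relative URLs such as page.html are rejected too.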
