Implementing a line diff in PHP

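Today I came across a rather magical function that displays the differences between the old and new versions of a line of code. I call it magical because its implementation is so concise.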

<?php

//$old = 'h e l l o o o o o o o o / o o o o o 1 2 3 4';
//$new = 'h e l l o o o 0 0 ? o o o o o o o o o 1 A 3 4';
$old = 'hellllooo1234';
$new = 'helllO11OOoo1234';

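// Recursive diff over two token arrays: find the longest run of consecutive
// tokens common to $old and $new, keep it unchanged, then recurse on the
// parts to its left and right.
function diff($old, $new){
    // $matrix[$oindex][$nindex] = length of the common run ending at these positions
    $matrix = array();
    $maxlen = 0;
    foreach($old as $oindex => $ovalue){
        // strict comparison, so numeric-looking tokens such as '1' and '01' are not treated as equal
        $nkeys = array_keys($new, $ovalue, true);
        foreach($nkeys as $nindex){
            $matrix[$oindex][$nindex] = isset($matrix[$oindex - 1][$nindex - 1]) ?
                $matrix[$oindex - 1][$nindex - 1] + 1 : 1;
            if($matrix[$oindex][$nindex] > $maxlen){
                $maxlen = $matrix[$oindex][$nindex];
                // start offsets of the longest run in $old and $new
                $omax = $oindex + 1 - $maxlen;
                $nmax = $nindex + 1 - $maxlen;
            }
        }
    }
    // nothing in common: all of $old is deleted, all of $new is inserted
    if($maxlen == 0) return array(array('d'=>$old, 'i'=>$new));
    return array_merge(
        diff(array_slice($old, 0, $omax), array_slice($new, 0, $nmax)),
        array_slice($new, $nmax, $maxlen), // the unchanged run
        diff(array_slice($old, $omax + $maxlen), array_slice($new, $nmax + $maxlen)));
}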

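// Splits both strings on whitespace, diffs the tokens, and wraps
// deletions in <del> and insertions in <ins>.
function htmlDiff($old, $new){
    $ret = '';
    $diff = diff(preg_split("/[\s]+/", $old), preg_split("/[\s]+/", $new));
    foreach($diff as $k){
        // array entries are changes ('d' = deleted, 'i' = inserted); plain strings are unchanged tokens
        if(is_array($k))
            $ret .= (!empty($k['d'])?"<del>".implode(' ',$k['d'])."</del> ":'').
                (!empty($k['i'])?"<ins>".implode(' ',$k['i'])."</ins> ":'');
        else $ret .= $k . ' ';
    }
    return $ret;
}

// put a space between every character so that the diff runs per character
echo htmlDiff( implode(' ', str_split($old)), implode(' ', str_split($new)));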

?>

The output looks like this:

h e l l l <del>l o</del> <ins>O 1 1 O O</ins> o o 1 2 3 4 

I am still working out exactly how it works and will add an explanation later.
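In the meantime, here is my tentative reading of the core step, as a sketch rather than a definitive explanation. diff() appears to look for the longest run of consecutive tokens shared by both inputs, keep that run unchanged, and then recurse on whatever lies to the left and right of it. The helper below isolates just that search. The name longestCommonRun is my own and is not part of the original code, and the sketch assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not in the original code): returns
// [length, start offset in $old, start offset in $new] of the longest
// run of consecutive tokens that appears in both arrays.
function longestCommonRun(array $old, array $new): array
{
    $runs = [];   // $runs[$o][$n] = length of the common run ending at $old[$o] and $new[$n]
    $maxlen = 0;
    $omax = 0;
    $nmax = 0;
    foreach ($old as $o => $token) {
        // every position in $new that holds the same token
        foreach (array_keys($new, $token, true) as $n) {
            // extend the run that ended one step earlier in both arrays
            $runs[$o][$n] = ($runs[$o - 1][$n - 1] ?? 0) + 1;
            if ($runs[$o][$n] > $maxlen) {
                $maxlen = $runs[$o][$n];
                $omax = $o - $maxlen + 1;
                $nmax = $n - $maxlen + 1;
            }
        }
    }
    return [$maxlen, $omax, $nmax];
}

$oldTokens = str_split('hellllooo1234');
$newTokens = str_split('helllO11OOoo1234');
[$len, $o, $n] = longestCommonRun($oldTokens, $newTokens);
echo "$len $o $n\n";                                       // 6 7 10
echo implode('', array_slice($oldTokens, $o, $len)), "\n"; // oo1234
```

For the example strings the longest shared run is "oo1234", which explains why it shows up untouched at the end of the output. The recursion then handles "helllo" against "helllO11OO", where "helll" is the next shared run and the leftovers become the <del> and <ins> parts.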

———————————————–

Update, 2013-11-26 11:15:28: Note that the preprocessing step splits each string with the regex \s+. That makes it easy to construct an input like this:

$old = 'hello world';
$new = 'hello    world';

for which the diff wrongly reports no difference at all.
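The effect can be seen in the tokenization alone, without running the diff. The sketch below first shows that both inputs produce identical token arrays under the \s+ split, then tries one possible workaround: capturing the whitespace runs so that they become tokens themselves. The helper name splitKeepingWhitespace is my own invention and not part of the original code.

```php
<?php
$old = 'hello world';
$new = 'hello    world';

// The split used by htmlDiff(): whitespace runs are thrown away,
// so both strings become ['hello', 'world'].
$oldTokens = preg_split('/\s+/', $old);
$newTokens = preg_split('/\s+/', $new);
var_dump($oldTokens === $newTokens); // bool(true)

// Possible workaround (an assumption, not the original author's fix):
// keep each whitespace run as its own token.
function splitKeepingWhitespace(string $text): array
{
    return preg_split('/(\s+)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$oldKept = splitKeepingWhitespace($old); // ['hello', ' ', 'world']
$newKept = splitKeepingWhitespace($new); // ['hello', '    ', 'world']
var_dump($oldKept === $newKept);         // bool(false)
```

If these tokens were fed to diff(), htmlDiff() would also have to join them with an empty string instead of a single space, since the spacing now lives in the tokens. I have not tried that change against the full function.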
