php下载github所有仓库备份到本地
版权声明:
本文为博主原创文章,转载请声明原文链接...谢谢。o_0。
更新时间:
2019-05-28 13:27:36
温馨提示:
学无止境,技术类文章有它的时效性,请留意文章更新时间,如发现内容有误请留言指出,防止别人"踩坑",我会及时更新文章
在网上听说github.com的账号可能会被封掉,封掉后里面所有仓库代码都不能访问啦,吓的我赶紧想个方法把把上面的代码给备份下来以防万一,脚本是使用php写的可以自动下载自己仓库列表里面的代码备份到本地
<?php
//要排除的仓库名字
$exclude = ['Inspeckage', 'jadx', 'ace', 'wke', 'DuiLib_Ultimate', 'DuiLib_Redrain', 'SeetaFaceEngine', 'SortListView', 'mui', 'SublimeText3', 'iview-admin'];
//github仓库列表地址
$url = 'https://github.com/zhaokeli?tab=repositories';
//使用方法
$header = <<<eot
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Cookie: **********************************************
Host: github.com
Referer: https://github.com/zhaokeli?after=Y3Vyc29yOnYyOpK5MjAxOS0wMi0xN1QxMDoxNTo1MyswODowMM4J5MCa&tab=repositories
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
eot;
function runCmd($cmdstr)
{
exec($cmdstr . ' 2>&1', $output, $return_val);
if ($return_val === 0) {
return implode("\r\n", $output);
} else {
return '';
}
}
/**
* opt请求数组数据
* @param [type] $url url地址
* @param [type] $postData post的数据数组key=>value对,如果此项有数据自动转换为post请求
* @param [type] $headers 请求头信息为一维数组
* @return [type] 返回网页内容
*/
function send_request($opt)
{
$conf = [
'url' => '',
'postdata' => [], //有数据时自动转为post
'headers' => [],
'post' => false, //默认为false,true时当前请求强制转为post
];
$conf = array_merge($conf, $opt);
if (is_string($conf['headers'])) {
$conf['headers'] = trim($conf['headers']);
$conf['headers'] = preg_split('/\r?\n/', $conf['headers']);
}
if (!$conf['url']) {
return '';
}
$ch = curl_init();
//跳过ssl检查项。
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, $conf['url']);
if ($conf['postdata'] || $conf['post'] === true) {
curl_setopt($ch, CURLOPT_POST, 1);
$postdata = http_build_query($conf['postdata']);
$conf['postdata'] && curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata); //设置post数据
} else {
curl_setopt($ch, CURLOPT_POST, 0);
}
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60); //超时
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //自动递归的抓取重定向后的页面,0时禁止重定向
$conf['headers'] && curl_setopt($ch, CURLOPT_HTTPHEADER, $conf['headers']); //设置请求头
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //返回内容不直接输出到浏览器,保存到变量中
$result = curl_exec($ch);
if ($error = curl_error($ch)) {
die($error);
}
curl_close($ch);
return $result;
}
function downloadbranchs()
{
//先取本地分支
$locbran = runCmd('git branch');
$locbran = explode("\n", $locbran);
array_walk($locbran, function (&$value) {
$value = trim($value);
$value = str_replace('* ', '', $value);
});
// var_dump($locbran);
// die();
//取远程分支
$str = runCmd('git ls-remote');
$arr = explode("\n", $str);
$tags = [];
$branchs = [];
foreach ($arr as $key => $value) {
$temarr = preg_split('/\s+/', $value);
$con = $temarr[1];
$name = str_replace('refs/heads/', '', $con);
if ($name == 'master') {
continue;
}
if (strpos($con, 'refs/heads/') !== false) {
$branchs[] = $name;
} else {
$tags[] = $name;
}
}
//本地没有的下载远程分支
foreach ($branchs as $key => $value) {
if (in_array($value, $locbran)) {
exec("git checkout {$value}");
echo "git pull...\n";
exec('git pull');
} else {
exec("git checkout -b {$value} origin/{$value}");
}
}
//下载完成后切回主分支
exec('git checkout master');
}
function downloadres($url, $name)
{
global $exclude;
echo "=======================================================================\n";
echo "download {$url} \n";
$resname = trim($name, '/');
list($username, $resname) = explode('/', $resname);
if (in_array($resname, $exclude)) {
echo "{$resname} is excluded!\n";
return;
}
$githubres = __dir__ . '/GitHubRepositories';
if (!file_exists($githubres)) {
mkdir($githubres);
}
$respath = $githubres . '/' . $resname;
if (file_exists($respath)) {
chdir($respath);
echo "git pull {$url} \n";
exec('git pull ' . $url);
downloadbranchs();
} else {
chdir($githubres);
echo "git clone {$url} \n";
exec('git clone ' . $url);
chdir($respath);
downloadbranchs();
}
}
while ($url) {
echo "request url:" . $url . "\n";
$con = send_request([
'url' => $url,
'headers' => $header,
]);
$url = '';
$reg = '<a href="(.*?)"\sitemprop\="name\scodeRepository">';
preg_match_all($reg, $con, $match);
$reslist = $match[1];
foreach ($reslist as $key => $value) {
$resurl = "https://github.com" . $value . ".git\n";
flush();
file_put_contents(__dir__ . '/github.txt', $resurl, FILE_APPEND);
downloadres($resurl, $value);
}
//查找是不是有下一页
if (preg_match('/<a\srel="nofollow"\shref="([^<>]*?)">Next<\/a>/', $con, $mat)) {
$url = $mat[1];
}
}使用说明
如果有些仓库代码特别大而自己又不想下载的话可以在代码中的排除变量数组中写进云就可以啦,执行后会自动在当前目录创建一个目录来保存所有代码如下

因为工具里主要使用的是php和git命令,所以使用前要先确保php和git的执行路径已经添加到系统的环境变量里
找一个空目录把上面代码保存成download-github.php文件,然后在当前目录创建一个bat文件里面写入
@echo off php download-github.php pause
注意事项
因为github里分为私有仓库和公共仓库,如果想下载私有仓库,就需要你先登陆github然后把登陆后的请求头中的cookie填入代码里的headers变量里,并且保证在命令行中clone你的私有仓库正常,如果不正常的话一般也是让你添加访问key