Ranked 10000 in Code Jam 2021 Qualification, Perl + Scala + Dart

I spent the Saturday competing in Google Code Jam 2021 qualification round, and earned a well-rounded rank: exactly 10000.

yoursunny rank 10000

Code Jam is about Algorithms

My first time competing in Code Jam was 2012. I heard about Code Jam earlier that year, from a flyer posted by the ACM student chapter at University of Arizona. Since then, I participated in Code Jam every year, and 2021 is my 10th year competing in Code Jam.

Code Jam isn't about code or jam, but about mathematics and algorithms. Although I'm a good network programmer and write a lot of production quality code, I struggle at algorithms especially when I'm under a time pressure. Nevertheless, I continued competing in this annual event, so that I can remember the last bits of algorithms, and occasionally learn a few new tricks by reading through the post-round analyses.

yoursunny.com is Served by Caddy

The last rebuild of yoursunny.com was in Spring 2017, when I moved the whole website into git repositories. It's been more than 3 years, and I think I should share an update on a few changes in the stack that serves this website.

History of HTTP Servers Behind yoursunny.com

Since 2011, my HTTP server of choice was lighttpd. Then, I have PHP running in FastCGI mode to serve the dynamic pages. It works, but I don't really like the lighttpd's script-like configuration structure. Moreover, there were suspected memory leaks in my setup, so that I had to use a cron job to restart the HTTP server weekly.

I keep hearing good words about nginx, as well as the benefits of running PHP in FPM mode. In 2013, I made the switch to nginx and PHP-FPM. The declarative configuration of nginx is easy to understand and makes sense to me.

HTTPS came to yoursunny.com at the end of 2015. Like many other websites, the TLS certificates were issued by Let's Encrypt, and requested through certbot command line client. One problem is that, my VPS at the time had only 64MB memory, and certbot would not work in such a small amount of memory. I had to request the TLS certificate on my laptop, and then upload it to the VPS.

Alexa, ask UA Campus Rec when does the gym open?

I built an Alexa Skill, and got it published.

How I Learned Alexa Skill Kit and Why I Picked it Up Again

I learned about Alexa Skill Kit at Hack Arizona 2017, and built an incredibly complicated 520bus skill to query departure times of Sun Tran buses and Sun Link modern streetcar. I didn't win a prize, and quickly forgot about that.

A few months later, a tweet caught my attention: publish a skill, get an Echo Dot & Alexa dev swag. I always wanted to own a hardware device that integrates with Alexa. I considered the "practically free" Dash Wand, but gave up when I found out it is powered by batteries. I couldn't afford to buy an Echo, because I already spent all my money on those ESP8266 toys. However, I'm not going to miss a FREE Echo Dot.

How to Deduplicate BibTeX Entries?

I'm writing my dissertation recently. My dissertation is a combination of several publications from my PhD career. Therefore, part of my dissertation writing process involves copy-pasting the papers together into a single document.

Like any good academics, I typeset my publications with LaTeX, and use BibTeX to incorporate citations into the documents. My collection of bibliographies is fairly ad hoc: during each writing project, I search for related work to cite in my paper. Unlike most others, I create a separate bib file for each BibTeX entry named after the citation key. For example, I would have bib/ndn-tlv.bib for a BibTeX entry named "ndn-tlv", and bib/Mininet.bib for the "Mininet" entry. This allows me to find available citation keys with a quick glance over the bib/ directory. My build process then concatenates these small bib files into ref.bib as an input to BibTeX.

My dissertation combines all my publications, and thus needs a union of BibTeX entries from those combinations. To make this union, I can copy all these single-entry bib files into the same directory. If two previous papers cited the same reference, their bib files should have the same name, and only one copy would be left in the combined directory.

Except that the above assumption is true only if I cited the same reference with the same citation key. And so I discovered a citation appearing twice in my dissertation:

duplicate references in dissertation

yoursunny.com is in Git and Completely Rebuilt

I started making websites since 2001. This website, yoursunny.com, started in 2006. In the past 11 years, I've rebuilt the site several times, switched from ASP to PHP, and moved from Windows dedicated hosting to shared hosting and eventually to Linux VPS. So far, every time I want to perform a major edit to the website, I copy original versions of affected files to a backup folder on my computer, and then go ahead to do the edit. After having tested the modification locally, I upload changed files via FTP or SFTP to the server. One constant worry over my head is, what if I lose all the files on my computer, and my hosting provider vanishes so I can't get them back? Another headache is, sometimes I may make an edit incorrectly, but I couldn't revert it back because I didn't copy the original files to the backup folder as I determined the change wasn't "major" enough to warrant a backup.

During my studies at University of Arizona, I learned a useful tool called git. Git is a source control system: it allows developers to create a repository and put source code into the repository, and will automatically keep track of all the edits applied to each file. By putting website source code into a git repository, I can find out what modifications I've performed to each file over time, regardless of whether it is "major" or "minor". Additionally, I can synchronize the git repository with a remote git server, so that the server has a copy of my website, including edit histories as well. This would solve both the worry of losing files, and the headache of not having an earlier version to revert to in case of an incorrect modification.

After delaying this projects several times, I am finally determined to move yoursunny.com into git repositories in Apr 2017. At the same time, to keep the website source code as clean as possible, I decide to try out two new technologies: static site generators and Composer. That is, I would rebuild yoursunny.com, copying page by page, into a new website stored in git repositories.

I spent about 2 months for this rebuild/move, and I'm happy to announce that yoursunny.com is now under source control.

Main Site with Composer

纯真IP数据库批量查询

纯真IP数据库(qqwry.dat)批量查询PHP脚本,qqwry.lib.php。此脚本适合一次性查询大量IP地址的归属地,比如在几万行的网站日志文件中添加IP归属地。

下载纯真IP数据库批量查询PHP脚本

使用前请获取纯真IP数据库存储qqwry.dat于同一目录。

接口使用示例

require 'qqwry.lib.php';
$qqwry=new QQWRY;
$ip_arr=array();
$ip_arr[]=ip2long('8.8.8.8');
$ip_arr[]=ip2long('10.0.0.1');
$results=$qqwry->query($ip_arr);
foreach ($results as $ip=>$record) {
    echo long2ip($ip).' '.$record->get_c().' '.$record->get_a()."\n";
}

又一个加密PHP脚本的解码方法

三个星期以前我发布了一篇文章,介绍了base64加密的PHP脚本的解码方法。前几天,飞信好友行者又扔来了一段更加复杂、诡异的PHP脚本:

下载每一步的源代码

//0.php
<?php
$OOO0O0O00=__FILE__;$OOO000000=urldecode('%74%68%36%73%62%65%68%71%6c%61%34%63%6f%5f%73%61%64%66%70%6e%72');$OO00O0000=26408;$OOO0000O0=$OOO000000{4}.$OOO000000{9}.$OOO000000{3}.$OOO000000{5};$OOO0000O0.=$OOO000000{2}.$OOO000000{10}.$OOO000000{13}.$OOO000000{16};$OOO0000O0.=$OOO0000O0{3}.$OOO000000{11}.$OOO000000{12}.$OOO0000O0{7}.$OOO000000{5};$O0O0000O0='OOO0000O0';eval(($$O0O0000O0('JE9PME9PMDAwMD0kT09PMDAwMDAwezE3fS4kT09PMDAwMDAwezEyfS4kT09PMDAwMDAwezE4fS4kT09PMDAwMDAwezV9LiRPT08wMDAwMDB7MTl9O2lmKCEwKSRPMDAwTzBPMDA9JE9PME9PMDAwMCgkT09PME8wTzAwLCdyYicpOyRPTzBPTzAwME89JE9PTzAwMDAwMHsxN30uJE9PTzAwMDAwMHsyMH0uJE9PTzAwMDAwMHs1fS4kT09PMDAwMDAwezl9LiRPT08wMDAwMDB7MTZ9OyRPTzBPTzAwTzA9JE9PTzAwMDAwMHsxNH0uJE9PTzAwMDAwMHswfS4kT09PMDAwMDAwezIwfS4kT09PMDAwMDAwezB9LiRPT08wMDAwMDB7MjB9OyRPTzBPTzAwME8oJE8wMDBPME8wMCwxMTgyKTskT08wME8wME8wPSgkT09PMDAwME8wKCRPTzBPTzAwTzAoJE9PME9PMDAwTygkTzAwME8wTzAwLDkwOCksJ0kvTU5LQUNkVlJHUXlEV1VncTY4d3BrYXpMTzVsdG5tVEIrMGJ2OXVIcnhGN1hTWTFFM2ZaaGlqYzRlMm9Kc1A9JywnQUJDREVGR0hJSktMTU5PUFFSU1RVVldYWVphYmNkZWZnaGlqa2xtbm9wcXJzdHV2d3h5ejAxMjM0NTY3ODkrLycpKSk7ZXZhbCgkT08wME8wME8wKTs=')));return;?>
tiBr5CwHGMBrljDvtMTb6AqwwAJ8qpRkqpRmpbA6wh7uwZp6pbp6aZ4/8wwua6brR+zHVkp3LktrGMlHGMcxaMcrUiqHzkvSzk4lQ9DY56voGMBlQ+rlaMcrUiBFyuDH5jplQ9DY56voGMBlQ+rlaMcrUiEYziA7OCJftMbuQMqVpAqgahDAwvLAwvJkgpR8k3t8qpRkqpRm8bADq6ttG6brmd1HGCvflipZGMqmwZp6pbp6k3tVpAqgaZBUwhgua6brR+zHVkp3LktrGMlHGMcxaMcrUiqHzkvSzk4lQ9DY56voGMBlQ+rlaMcrUiBFyuDH5jplQ9DY56voGMBlQ+rlaMcrUiEYziA7OCJftMbuQMqmwZp6pbp6k3tVpAqgaZBUwhgua6brG6vbOkwHRomF3Fu81JU31P7TLCBBOk4B5+405iZTOC73liBYt6405iij1o2GHe3EYFazEsYGStUNH2r75iDB5CBYlj6BH2UyZUG4S7Q3EsY/x7shwpC0S0gED8V1DMlrW3qU8fI18fI18fIJljq3ajRvlCEBziwHRhJmqbvyqpJmR31+R3VSRKJU8f/UyKo1yMc+R3V7GMqU8Zo1yNI18fIHRKJUyKJUyN/UyMTb8Zo18Zo1yN/UGMqUyNI18f/UyNI7RKJUyN/UyNI1yMb7RZbY8w4QgwDbpvRdwavKphpul8zctj/Fzary8fp7tC4XpKVFyCRiWapVluBCDhB8k8AAyiLOOCvxzfqvy9JGlhIJR31ugwRNqKpCqZBR6bXy8w4UwAA6whqppvtzkprBz9DbLkLuOCvxOiEX59J1laRftdpitjB4n0IEy0yZD8zjWNbFQ3lrG6brWiL05CJfL6Tb8fI1yKo18fI1G8Xvt9A7GMqU8fI18fI18fIrW1==Ngr3LaAhOaRvGMlSQ+J05CAfl3J05i4SLkDZQu/HlMlrW1ZGOk405dpbL6TuQ+cYziEBljyYLCRmljA7Qu/HlMlrW1ZGOk405dpbL6TuQ+cYziEBljyY(后面还有大量数据,省略)

其中,在?>后面的数据足有27KB(共27316字节)。显然,这些数据并不是直接输出给客户端的,而要在服务端经过一定的处理。这27KB的数据看起来很像base64编码,但是直接用base64_decode解码得不到任何有意义的结果。

仔细观察,在前面的PHP代码部分有一个eval。那就按照上一篇文章的办法,把它改成echo试试!

base64加密PHP脚本的解码方法

PHP是网站服务端最流行的编程语言之一。PHP运行环境本身是开源的,服务器不加载插件时PHP脚本也无法加密。但是,总有人因为商业上的考虑,而将PHP程序通过各种方法进行混淆,使读者很难看到清晰易懂的代码。

然而,PHP运行环境的本质决定了,被混淆、编码的PHP脚本总是有办法恢复成可读的代码的。本文介绍了一种对含有LAVEbase64_decode的、被加密的PHP的解码方法。

在使用这种方法之前,你应该准备好:

  • 能运行PHP的Web服务器,例如 Apache 或 IIS
  • wget.exe命令行客户端 或 浏览器
  • 具备PHP语法高亮功能的文本编辑器,例如 Notepad2

下载每一步的源代码

P2P流量监控Web平台的设计与实现

摘要

P2P点对点传输的不断增长占用大量骨干网带宽,影响关键业务的正常运行。本文首先介绍了主要的P2P协议识别技术,讨论了改进思路。然后安装了一个基于iptables和l7filter的P2P流量监控节点,并用AJAX和PHP编写了一个Web监控平台以便通过Web管理监控节点、在图形界面中查看P2P流量日志,组成了一套P2P流量监控系统。

关键词 P2P,流量,监控,网络,l7filter

ABSTRACT

Fast growth of Peer-to-Peer traffic are taking more and more backbone network bandwidth, and may slow down key businesses. In this paper, major P2P identification techniques and a way to enhance them are introduced at first. Then we set up a P2P traffic detection router based on iptables and l7filter. A Web application is designed and implemented with AJAX and PHP, so that a network administrator can manage the router or view router logs in this Web application. The router and the Web application make up a P2P detection system.

dbMySQL数据源访问类

dbMySQL类,用于PHP访问MySQL数据库,而编程更加方便、安全性更高。

<?php
class dbMySQL {//数据源访问类
	private $conn;//连接
	private $prefix;//表名前缀
	private $sql=NULL;//上一条SQL语句
	private $result=NULL;//结果
	private $rows=0;//行数
	private $rows_got=0;//已提取行数
	//$db=new dbMySQL(主机,用户名,密码,数据库名,表名前缀,是否永久连接);
	function __construct($host,$username,$password,$dbname,$prefix='',$pconnect=TRUE) {//连接
		$connect_function=$pconnect?'mysql_pconnect':'mysql_connect';
		$this->conn=@$connect_function($host,$username,$password) or die('不能连接到MySQL数据源服务');
		mysql_query("SET NAMES 'utf8'",$this->conn);
		@mysql_select_db($dbname,$this->conn) or die('MySQL数据源中数据库不存在');
		$this->prefix=$prefix;
	}
	private function close() {//如果上一个结果存在,清除它
		if ($this->result!=NULL) mysql_free_result($this->result);
	}
	public function execute_sql($SQL) {//执行
		//echo $SQL;
		$this->close();
		$this->sql=$SQL;
		mysql_query($SQL,$this->conn);
		$err=mysql_error($this->conn); if ($err!='') die($err);
		$this->result=NULL;
		$this->rows=mysql_affected_rows($this->conn);
		$this->rows_got=0;
	}
	public function execute($SQL,$p=NULL,$prefix='###') {//带参数执行
		$s=str_replace($prefix,$this->prefix,$SQL);
		if ($p==NULL) { $this->execute_sql($s); return; }
		foreach ($p as $i => $v) {
			if (ctype_digit($i{0}) && !is_numeric($v)) die('SQL数值型参数错误 '.$i.'=>'.$v);
			$vv=ctype_lower($i{0})?"'".mysql_escape_string($v)."'":$v;
			$s=str_replace('?'.$i,$vv,$s);
		}
		$this->execute_sql($s);
	}
	//带参数执行、查询说明
	//数组$p "a"->"ppp"
	//SELECT * FROM j WHERE m=?a 解析为 SELECT * FROM j WHERE m='ppp'ִ
	//参数名称第一位是小写字母,作为字符型参数,编码并加单引号
	//参数名称第一位是数字,作为数字型参数,如果不是数字就出错
	//参数名称第一位是其他情况,直接带入SQL表达式
	//$prefix表示前缀占位符,默认###,FROM ###kk在前缀是pr_时解析为FROM pr_kk
	public function query_sql($SQL) {//查询
		//echo $SQL;
		$this->close();
		$this->sql=$SQL;
		$this->result=mysql_query($SQL,$this->conn);
		$err=mysql_error($this->conn); if ($err!='') die($err);
		$this->rows=mysql_num_rows($this->result);
		$this->rows_got=0;
	}
	public function query($SQL,$p=NULL,$prefix='###') {//带参数查询
		$s=str_replace($prefix,$this->prefix,$SQL);
		if ($p==NULL) { $this->query_sql($s); return; }
		foreach ($p as $i => $v) {
			if (ctype_digit($i{0}) && !is_numeric($v)) die('SQL数值型参数错误 '.$i.'=>'.$v);
			$vv=ctype_lower($i{0})?"'".mysql_escape_string($v)."'":$v;
			$s=str_replace('?'.$i,$vv,$s);
		}
		$this->query_sql($s);
	}
	public function read() {//提取一行为关联数组,如果已提取完则返回NULL
		if ($this->result==NULL) return NULL;
		if ($this->rows_got==$this->rows) return NULL;
		++$this->rows_got;
		return mysql_fetch_assoc($this->result);
	}
	public function num_rows() {//总行数
		return $this->rows;
	}
	public function eof() {//是否提取完了?
		return ($this->rows_got==$this->rows);
	}
}
?>

dbMySQL优点

  • 高安全性:强制检查类型、自动转义,避免SQL注入漏洞
  • 方便编程:面向对象语法,不必写mysql_fetch_assoc等长长的函数名
  • 方便编程:仿reader机制,直接从对象读出数据,不必再写$resultID
  • 支持表名前缀:允许在一个数据库中运行多个应用

dbMySQL样例代码