Python Install Notes

Aug. 25, 2012, 1:57 a.m.

Install Python from source

./configure && make && sudo make install

Install ez_setup

sudo /usr/local/bin/python ez_setup.py
sudo /usr/local/bin/easy_install pip
sudo /usr/local/bin/easy_install -U distribute

Install mysql-python

sudo yum install mysql-devel
sudo pip install -Iv http://pypi.python.org/packages/source/M/MySQL-python/MySQL-python-1.2.3.tar.gz

Install pyOpenSSL

sudo pip install -Iv http://pypi.python.org/packages/source/p/pyOpenSSL/pyOpenSSL-0.12.tar.gz

Package list

BeautifulSoup==3.2.1
MySQL-python==1.2.3
PIL==1.1.7
PyYAML==3.10
SQLAlchemy==0.7.6
Scrapy==0.14.2
Twisted==12.0.0
chardet==1.0.1
lxml==2.3.4
nltk==2.0.1rc4
pyOpenSSL==0.12
readability-lxml==0.2.5
thrift==0.8.0
virtualenv==1.7.1.2
w3lib==1.0
web.py==0.36
wsgiref==0.1.2
zope.interface==3.8.0

Install:

sudo pip install {package name}
or:
cat packages.list | cut -d '=' -f 1 | xargs sudo pip install

For example:

cat a | awk -F"==" '{print $1}' | sudo xargs /usr/local/bin/pip install

After installation, check that no package version is lower than it was before. Command to check:

pip freeze
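
The version check can also be scripted. The following is a minimal sketch, assuming you saved the output of pip freeze before and after the reinstall; the function names are made up for illustration. The version comparison is deliberately naive: it compares only the numeric runs in each version string, so pre-release tags such as the rc4 in nltk==2.0.1rc4 are not ordered correctly.

```python
import re


def parse_freeze(text):
    """Parse `pip freeze` output into a {lowercased name: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and entries that are not pinned
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def version_key(version):
    """Naive sort key: the numeric runs of the version string, in order."""
    return [int(part) for part in re.findall(r"\d+", version)]


def find_downgrades(before, after):
    """Return names that are missing from `after` or have a lower version."""
    old, new = parse_freeze(before), parse_freeze(after)
    problems = []
    for name, old_version in sorted(old.items()):
        new_version = new.get(name)
        if new_version is None or version_key(new_version) < version_key(old_version):
            problems.append(name)
    return problems
```

Calling find_downgrades(old_freeze_text, new_freeze_text) returns the lowercased names that need another look.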

Add to Google Reader - Bookmarklet

Sept. 19, 2012, 2:06 p.m.

Add to Google Reader with one click

var b=document.body;
var GR________bookmarklet_domain='http://www.google.com';
if(b&&!document.xmlVersion){
    void(z=document.createElement('script'));
    void(z.src='http://www.google.com/reader/ui/subscribe-bookmarklet.js');
    void(b.appendChild(z));
}else{
    location='http://www.google.com/reader/view/feed/'+encodeURIComponent(location.href);
}

Django Notes

Aug. 23, 2012, 4:45 p.m.

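Configure Admin

Enabling admin gives you a complete user management system, which is very nice. First check the version of django-admin.py:

python django-admin.py --version

Configure the database (sqlite3) by editing the database settings in myproj/settings.py: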

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3', 
        'NAME': os.path.dirname(__file__) + '/../db/myblog.db',                      
    }
}
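
The string concatenation above depends on os.path.dirname(__file__) being non-empty. If __file__ is a bare relative name such as 'settings.py', dirname returns '' and the result becomes '/../db/myblog.db'. A small helper avoids that; this is only a sketch, and project_path is a hypothetical name, not something Django provides.

```python
import os


def project_path(settings_file, *parts):
    """Join parts onto the directory holding settings_file, then normalize."""
    base = os.path.dirname(os.path.abspath(settings_file))
    return os.path.normpath(os.path.join(base, *parts))


# In settings.py this would be called with __file__, for example:
# 'NAME': project_path(__file__, '..', 'db', 'myblog.db')
```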

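Initialize the database

python manage.py syncdb

Enter the admin username and password when prompted. If you forget to initialize the database, Django fails with the error "no such table: django_session".

In myproj/settings.py, uncomment django.contrib.admin in INSTALLED_APPS to enable admin support: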

INSTALLED_APPS = (
    #...
    'django.contrib.admin',
    #...
)

In myproj/urls.py, uncomment the admin-related lines:

from django.conf.urls import patterns, include, url

from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    url(r'^admin/', include(admin.site.urls)),
)

Start Django

python manage.py runserver 0.0.0.0:8000

The admin site is then reachable at http://youhost:8000/admin.

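Configure static files

Method 1

Add an entry to the settings file in the project directory:

STATIC_PATH='./static'       # directory that holds the static files

Add an import to the urls file in the project directory:

from myproject import settings  # myproject is the name of this project

Add a line to urlpatterns:

(r'^site_media/(?P<path>.*)$', \
    'django.views.static.serve', \
        {'document_root': settings.STATIC_PATH}),

Method 2

Edit myproj/settings.py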

STATIC_URL = '/static/'

STATICFILES_DIRS = (
    os.path.dirname(__file__) + '/../static/',
)

Write a simple app

Create a new app: python manage.py startapp blog

Write the models: blog/models.py

#-*- encoding: UTF-8 -*-
import datetime

import markdown
from mongoengine import (DateTimeField, Document, ListField, StringField,
                         connect)

# Connect to the local MongoDB database named 'blog'.
connect('blog', host='127.0.0.1', port=27017)

class Blog(Document):
    aid = StringField()
    title = StringField(required=True, default='title here')
    content_md = StringField(default="#header\n\npara")
    content_html = StringField()
    time_create = DateTimeField(default=datetime.datetime.now)
    time_update = DateTimeField()
    tags = ListField(StringField(max_length=30))

    def save(self, *args, **kwargs):
        # Refresh the timestamp and re-render the HTML on every save.
        self.time_update = datetime.datetime.now()
        self.content_html = markdown.markdown(
            self.content_md,
            extensions=['codehilite'])
        return super(Blog, self).save(*args, **kwargs)

Write the controller: blog/views.py

from django.http import HttpResponse
from blog.models import Blog
from django.template import Context, loader

def index(request):
    blogs = Blog.objects()
    t = loader.get_template('blog/index.tpl')
    c = Context({
        'blog_list' : blogs,
    })
    return HttpResponse(t.render(c))

def detail(request, blog_id):
    return HttpResponse(blog_id)

Update the routes in myproj/urls.py

urlpatterns = patterns('',
    url(r'^blog/$', 'blog.views.index'),
    url(r'^blog/(?P<blog_id>.+)/$', 'blog.views.detail'),
)

Update the template path setting in myproj/settings.py

TEMPLATE_DIRS = (
    os.path.dirname(__file__) + '/../template/',
)

Create the template ./template/blog/index.tpl

Response

JSON:

HttpResponse(json.dumps(res.values()), mimetype='text/plain')
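
The serialization step can be tried without Django. A minimal sketch, assuming res is a plain dict (the notes do not say what it is); to_json_body is a name made up here. If the client should parse the response as JSON, application/json is the standard media type, although the line above sends text/plain.

```python
import json


def to_json_body(res):
    """Serialize the values of the dict `res` as a JSON array string."""
    # list(...) is needed on Python 3, where dict.values() returns a view
    # that json.dumps cannot serialize; on Python 2 it is a harmless copy.
    return json.dumps(list(res.values()))
```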

Template:

HttpResponse(t.render(c))

Rsync Notes

Sept. 3, 2012, 11:47 a.m.

rsync -vzrtopg  --delete  ${local_dir} us01.v9.com:${dest_dir}
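
For reference, the bundled flags can be spelled out one by one. The sketch below only builds the argument list and does not run anything; build_rsync_command is a hypothetical helper, and the host default is simply the one from the command above.

```python
def build_rsync_command(local_dir, dest_dir, host="us01.v9.com"):
    """Return the argv list equivalent to the rsync command in the note above."""
    return [
        "rsync",
        "-v",  # verbose output
        "-z",  # compress data in transit
        "-r",  # recurse into directories
        "-t",  # preserve modification times
        "-o",  # preserve owner
        "-p",  # preserve permissions
        "-g",  # preserve group
        "--delete",  # remove destination files that no longer exist locally
        local_dir,
        "%s:%s" % (host, dest_dir),
    ]
```

The list can be handed to subprocess.check_call to run it. Be careful with --delete: it removes files on the destination that are absent from the source.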

Bootstrap Resources

Aug. 28, 2012, 2:44 p.m.

Resources

Mongodb Create Index

Aug. 28, 2012, 12:02 p.m.

Create and view indexes

$ mongo
>use paylog
>db.paylog.ensureIndex({'src' : 1})
>db.paylog.ensureIndex({'date' : 1})
>db.paylog.getIndexes() //indexes for collection
>db.system.indexes.find() //indexes for db

References

Mongodb Data Import / Export

Aug. 28, 2012, noon

Export:

./bin/mongodump --host=127.0.0.1:27017 -dblog -o./blog_data

Import:

./bin/mongorestore --host=127.0.0.1:27017  ./blog_data

References:

Mysql Data Import / Export

Aug. 28, 2012, 12:36 p.m.

Export:

#!/bin/sh

# Dump the database to the file given as the only argument.
if [ $# -ne 1 ]
then
    echo "Usage: export_db.sh [path]"
    exit 1
fi

mysqldump database -v -h127.0.0.1 -uuser -ppassword > "$1"

Import:

#!/bin/sh

# Load the dump file given as the only argument into the database.
if [ $# -ne 1 ]
then
    echo "Usage: import_db.sh [path]"
    exit 1
fi

mysql -Ddatabase -h127.0.0.1 -uuser -ppassword < "$1"

Hadoop Install Notes

Aug. 24, 2012, 6:28 p.m.

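To set up a distributed Hadoop cluster, prepare at least two machines.

Establish SSH trust

Log in to one of the machines and create an RSA public/private key pair.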

$ ssh-keygen  -t rsa 
$ cd ~/.ssh 
$ cat id_rsa.pub  > authorized_keys 
$ chmod 600 authorized_keys

Log in to the local machine to check that the trust relationship works. If no password is requested, it succeeded.

$ ssh localhost

Pack up .ssh and distribute it to the machines that need to trust each other, then extract .ssh.tar.gz in the home directory on each target machine.

$ cd ~/ 
$ tar -czf .ssh.tar.gz .ssh

Check and install the JDK

OpenJDK cannot be used; the JDK version must be 1.6 or later.

$ java -version 
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)

Download and configure Hadoop

Datetime / Time conversions

Aug. 24, 2012, 6:10 p.m.

Linux

Timestamp to datetime string

date -d @1343779200 "+%Y-%m-%d %H:%M:%S"

Timestamp to UTC datetime string

date -d @1343779200 -u "+%Y-%m-%d %H:%M:%S"

python

UTC datetime string to timestamp

import calendar

def utc_datestr2ts(date_str):
    # Expects a UTC date in YYYYMMDD form; returns 0 for any other length.
    if len(date_str) != 8:
        return 0

    time_tuple_utc = (int(date_str[0:4]),
        int(date_str[4:6]), int(date_str[6:8]), 0, 0, 0)
    # calendar.timegm treats the tuple as UTC and returns an integer timestamp.
    return calendar.timegm(time_tuple_utc)
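
The opposite direction, matching the date -u command in the Linux section, can be sketched in Python as well. ts2utc_datestr is a name chosen here for illustration.

```python
import calendar
import time


def ts2utc_datestr(timestamp):
    """Format a Unix timestamp as a UTC 'YYYY-MM-DD HH:MM:SS' string."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))


# Round trip through calendar.timegm, the call used in the function above.
assert calendar.timegm((2012, 8, 1, 0, 0, 0)) == 1343779200
```

With the timestamp from the Linux examples, ts2utc_datestr(1343779200) gives '2012-08-01 00:00:00'.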

References

Linux Hosts Not Work

Aug. 24, 2012, 6:01 p.m.

Why /etc/hosts has no effect:

[root@localhost ~]# ll /etc/nsswitch.conf
-rw------- 1 root root 1716 Jun 4 18:29 /etc/nsswitch.conf

The 600 permissions prevent ordinary users from reading the file; changing them to 644 fixes it.

sudo chmod 644 /etc/nsswitch.conf
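
The same check can be made from a script. A minimal sketch for POSIX systems; is_world_readable is a hypothetical helper name.

```python
import os
import stat


def is_world_readable(path):
    """Return True if 'other' users have read permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

is_world_readable('/etc/nsswitch.conf') should be True once the permissions are 644.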

Nginx Domain Redirect

Aug. 24, 2012, 5:48 p.m.

if ($http_host = chengen.me ) {
    rewrite ^ http://www.chengen.me$request_uri? permanent;
}
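
In the rule above, permanent makes nginx answer with a 301, and the trailing ? stops nginx from appending the original query string a second time, since $request_uri already contains it. The decision itself is small enough to model in a few lines. This is only an illustration of the logic, not something nginx runs, and redirect_target is a made-up name.

```python
def redirect_target(http_host, request_uri):
    """Return the URL to redirect to, or None if no redirect applies."""
    if http_host == "chengen.me":
        # request_uri already carries the query string, like $request_uri
        return "http://www.chengen.me" + request_uri
    return None
```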

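Django static files configuration

Aug. 24, 2012, 3:40 p.m.

I could not find a concise explanation of how to configure static files in Django, so I took a few wrong turns before it finally made sense. To make development easier, Django distinguishes between a development mode and a deployment mode.

In development mode we use Django itself as the web server, so Django has to be configured to serve static files. In deployment mode we usually run Django in CGI mode behind a web server, and the web server serves the static files. Django's settings.py has two settings for static file paths, "STATIC_ROOT" and "STATICFILES_DIRS".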

STATIC_ROOT = os.path.dirname(__file__) + '/../webroot/static/'

STATICFILES_DIRS = (
    os.path.dirname(__file__) + '/../webroot.dev/static/',
)
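  1. "STATIC_ROOT" is really there for manage.py collectstatic, which gathers the static files used in the development environment and copies them all into the "STATIC_ROOT" directory.

  2. "STATICFILES_DIRS" lists the paths where static files live in development mode. You can list several, and collectstatic will eventually copy all of them into "STATIC_ROOT".

Once that is clear the rest is easy. When deploying, first run the following command to collect the static files in one place: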

python manage.py collectstatic
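
To make the idea concrete, here is a much simplified model of what the command does with the "STATICFILES_DIRS" entries. It is a sketch, not Django's implementation: the real command also collects each app's static files and has more options. collect_static is a name invented for this example.

```python
import os
import shutil


def collect_static(source_dirs, static_root):
    """Copy every file under source_dirs into static_root, keeping relative paths.

    If two source directories contain the same relative path, the directory
    listed first wins and later copies are skipped.
    """
    copied = []
    for source in source_dirs:
        for dirpath, _dirnames, filenames in os.walk(source):
            for name in filenames:
                src = os.path.join(dirpath, name)
                rel = os.path.relpath(src, source)
                if rel in copied:
                    continue  # an earlier directory already provided this file
                dst = os.path.join(static_root, rel)
                dst_dir = os.path.dirname(dst)
                if not os.path.isdir(dst_dir):
                    os.makedirs(dst_dir)
                shutil.copy2(src, dst)
                copied.append(rel)
    return copied
```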

Then configure the web server to serve all static file requests from the directory that "STATIC_ROOT" points to. The corresponding nginx configuration:

server {

    location ~* ^.+\.(html|jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$
    {
        expires 30d;
        break;
    }

    location / {
        fastcgi_pass   unix:/home/work/serversoft/apps/myblog/var/myblog.sock;

        #necessary parameter
        fastcgi_param PATH_INFO $fastcgi_script_name;

        # to deal with POST requests
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param CONTENT_LENGTH $content_length;
    }
}

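Country detection

Aug. 24, 2012, 3:01 p.m.

Country detection strategy

  1. Check whether the cookie contains a country setting; if it does, use it.
  2. Use geoip to get the country the request comes from; if that country is in our supported list, use it.
  3. Check Accept-Language for the country that best matches what we support; if one is found, use it.
  4. If none of the above determines the country, redirect to the default country.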
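
The four steps can be sketched as a single function. Several things here are assumptions made for illustration: the function and parameter names, the idea that the cookie and geoip lookups have already produced country codes, and the rule that the region subtag of a language tag (the US in en-US) names the country.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best q-value first."""
    weighted = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0
        for param in parts[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        weighted.append((quality, tag))
    # sorted() is stable, so tags with equal q-values keep header order
    return [tag for quality, tag in sorted(weighted, key=lambda pair: -pair[0])]


def detect_country(cookie_country, geoip_country, accept_language, supported, default):
    """Apply the four steps in order and return a country code."""
    # 1. A country stored in the cookie wins.
    if cookie_country:
        return cookie_country
    # 2. The geoip country, if it is one we support.
    if geoip_country in supported:
        return geoip_country
    # 3. Best Accept-Language match (assumes the region subtag is the country).
    for tag in parse_accept_language(accept_language or ""):
        if "-" in tag:
            region = tag.split("-")[-1].upper()
            if region in supported:
                return region
    # 4. Nothing matched: fall back to the default country.
    return default
```

For example, with supported countries US and BR, a request with no cookie, a geoip result of DE and the header "pt-BR,pt;q=0.9,en-US;q=0.8" resolves to BR.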

Technical implementation

Python Notes

Aug. 20, 2012, 7:09 p.m.

file operations

write file

out_file = open("test.txt", "wt")
out_file.write("This Text is going to out file\nLook at it and see!")
out_file.close()

file exists

import os
os.path.exists(filename)

read file

for line in open(filename).readlines():
    print(line)

For local files:

open(filename).read()

For remote files (e.g. a web page):

import urllib2
urllib2.urlopen(url).read()

Read file with encoding

import codecs
text = codecs.open(filename, mode='r', encoding='utf8').read()
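
The snippets above leave closing the file to the garbage collector. A variant that closes files deterministically is sketched below; it uses io.open, which accepts an encoding on both Python 2.6+ and Python 3, and the two helper names are made up here.

```python
import io


def write_text(filename, text, encoding="utf8"):
    """Write unicode text to filename, closing the file even on error."""
    with io.open(filename, "w", encoding=encoding) as out_file:
        out_file.write(text)


def read_text(filename, encoding="utf8"):
    """Read the whole file and decode it with the given encoding."""
    with io.open(filename, "r", encoding=encoding) as in_file:
        return in_file.read()
```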

invoke parent's instance method

super(Blog, self).save()

Perl Daemon

Aug. 24, 2012, 2:16 p.m.

use POSIX qw(setsid);  # setsid() is not a Perl builtin

sub become_daemon {
    die "Can't fork" unless defined (my $child = fork);
    exit 0 if $child;
    setsid();
    open( STDIN, "</dev/null" );
    open( STDOUT, ">/dev/null" );
    open( STDERR, ">&STDOUT" );
    chdir '/';
    umask(0);
    $ENV{PATH} = "/bin:/sbin:/usr/bin:/usr/sbin";
    return $$;
}

NGINX + Django FastCGI configuration

Aug. 24, 2012, 1:14 p.m.

nginx configuration

http://wiki.nginx.org/DjangoFastCGI

Command Line Blog TODO

Aug. 20, 2012, 7:06 p.m.

command line blog

mrd

  1. add/update/delete blog ;
  2. blog list;
  3. markdown and code highlighting
  4. tags
  5. blog search - full text search

tech prepare

  1. mongodb using mongoengine / apidoc

    http://mongoengine.org/

  2. markdown translate / markdown

    easy to extend, such as code highlighting http://freewisdom.org/projects/python-markdown/

  3. code highlighting

    http://pygments.org/

  4. python command line parser

design

Function design

  1. create blog

    blog create

  2. save blog

    blog save

  3. search and list all blogs

    blog list [keyword]

  4. get blog markdown or html

    blog get [-type=html]

  5. delete blog, mark blog as deleted

    blog delete
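
The five commands map naturally onto subcommands. A minimal sketch using argparse from the standard library (the notes do not say which parser was chosen, so this is an assumption). The option accepts both the -type spelling from the list above and the more conventional --type.

```python
import argparse


def build_parser():
    """Build a parser for the five commands listed above."""
    parser = argparse.ArgumentParser(prog="blog")
    commands = parser.add_subparsers(dest="command")

    commands.add_parser("create", help="create a blog")
    commands.add_parser("save", help="save a blog")

    list_cmd = commands.add_parser("list", help="search and list blogs")
    list_cmd.add_argument("keyword", nargs="?", default=None)

    get_cmd = commands.add_parser("get", help="print a blog as markdown or html")
    # Accept the -type spelling from the notes and the usual --type form.
    get_cmd.add_argument("-type", "--type", dest="type",
                         choices=["markdown", "html"], default="markdown")

    commands.add_parser("delete", help="mark a blog as deleted")
    return parser
```

build_parser().parse_args(["list", "django"]) yields a namespace with command set to "list" and keyword set to "django".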

blog file

  1. blog meta

    from the head of the file until the first empty line.

    key:value key:value

  2. blog content

    after the first empty line.
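
A parser for this layout could look like the sketch below. It assumes one key:value pair per line in the meta block, which the note does not state explicitly, and parse_blog_file is a name chosen for illustration.

```python
def parse_blog_file(text):
    """Split a blog file into (meta dict, content string)."""
    lines = text.splitlines()
    meta = {}
    body_start = len(lines)  # no empty line means there is no content
    for index, line in enumerate(lines):
        if not line.strip():
            body_start = index + 1  # content begins after the first empty line
            break
        key, _sep, value = line.partition(":")
        meta[key.strip()] = value.strip()
    content = "\n".join(lines[body_start:])
    return meta, content
```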

schema