That TDD Fellow | Tech Blog | Screencasts

Let’s stop fearing our own creations and start being in control of them. Let’s be professional.

Cfapps Mongoid Configuration

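Cloud Foundry's MongoDB configuration has changed: since the service moved to MongoLab, it has to be configured differently. I never found a clean way to do it, so what I use is a rather clumsy configuration.

The background: after deploying to cf, it reported that the instance had not started. The logs showed a problem with the MongoDB configuration.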

          Cleaning up the bundler cache.
-----> Writing config/database.yml to read from DATABASE_URL
-----> Preparing app for Rails asset pipeline
       Running: rake assets:precompile
       rake aborted!
       Failed to connect to a master node at localhost:27017
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/mongo_client.rb:492:in `connect'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/mongo_client.rb:698:in `setup'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/mongo_client.rb:155:in `initialize'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/util/uri_parser.rb:171:in `new'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/util/uri_parser.rb:171:in `connection'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongo-1.9.0/lib/mongo/mongo_client.rb:203:in `from_uri'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config/database.rb:86:in `master'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config/database.rb:19:in `configure'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config.rb:290:in `configure_databases'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config.rb:111:in `from_hash'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config.rb:126:in `block in load!'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config.rb:125:in `tap'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/config.rb:125:in `load!'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid.rb:148:in `load!'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/mongoid-2.6.0/lib/mongoid/railtie.rb:84:in `block in <class:Railtie>'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/initializable.rb:30:in `instance_exec'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/initializable.rb:30:in `run'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/initializable.rb:55:in `block in run_initializers'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/initializable.rb:54:in `each'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/initializable.rb:54:in `run_initializers'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/application.rb:136:in `initialize!'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/railtie/configurable.rb:30:in `method_missing'
       /tmp/staged/app/config/environment.rb:5:in `<top (required)>'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/application.rb:103:in `require'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/application.rb:103:in `require_environment!'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/railties-3.2.12/lib/rails/application.rb:297:in `block (2 levels) in initialize_tasks'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/sprockets/assets.rake:93:in `block (2 levels) in <top (required)>'
       /tmp/staged/app/vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/sprockets/assets.rake:60:in `block (3 levels) in <top (required)>'

Fortunately, the log output also shows the correct MongoDB configuration:

cf logs

Output:

VCAP_SERVICES={"mongolab-n/a":[{"name":"mongo","label":"mongolab-n/a","plan":"sandbox","credentials":{"uri":"mongodb://CloudFoundry_7ukkkk_gfdkdaep_vphqtlf5:7RO0rjBSFYMBUM26_mb_LPRJGsGVgYAG@ds033828.mongolab.com:33828/CloudFoundry_7uk5v37r_gfdiiiiep"}}]}  

Use the mongo gem to open a connection:

    require 'mongo'
    mongo_uri = 'mongodb://CloudFoundry_7ukkkk_gfdkdaep_vphqtlf5:7RO0rjBSFYMBUM26_mb_LPRJGsGVgYAG@ds033828.mongolab.com:33828/CloudFoundry_7uk5v37r_gfdiiiiep'

    conn = Mongo::Connection.from_uri(mongo_uri)

With pry you can inspect the values you need (db_name, username, password, host, and port). Write them into mongoid.yml and it works. I have not found a better way.
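The same values can also be pulled out of the URI without a database connection. The following is a minimal sketch that uses only Ruby's standard library. The helper name `mongo_settings` and the sample credentials are made up for illustration; only the `mongolab-n/a` label and the `credentials.uri` layout come from the log output above.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: extracts the mongoid.yml values from a VCAP_SERVICES
# JSON string whose layout matches the log output shown above.
def mongo_settings(vcap_json, label = 'mongolab-n/a')
  uri = URI.parse(JSON.parse(vcap_json)[label].first['credentials']['uri'])
  {
    host: uri.host,
    port: uri.port,
    database: uri.path.sub(%r{\A/}, ''), # strip the leading slash
    username: uri.user,
    password: uri.password
  }
end

# Made-up stand-in for ENV['VCAP_SERVICES'].
sample_vcap = '{"mongolab-n/a":[{"name":"mongo","credentials":' \
              '{"uri":"mongodb://demo_user:demo_pass@db.example.com:33828/demo_db"}}]}'

settings = mongo_settings(sample_vcap)
settings.each { |key, value| puts "#{key}: #{value}" }
```

In a deployed app the argument would be `ENV['VCAP_SERVICES']` instead of the sample string.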

What Is MapReduce

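In MongoDB, MapReduce is an aggregation method that can be parallelized across multiple servers. It splits a problem up and sends the pieces to different machines, so that each machine completes one part. When all machines are finished, their results are merged into the final, complete result.

MapReduce takes several steps. It starts with map, which applies an operation to every document in the collection. That operation either does nothing or emits keys with values ("emit these keys with X values"). Next comes an intermediate stage called shuffle, which groups the emitted values by key and collects them into one list per key. Reduce then collapses the values in each list into a single value. That value is returned and shuffled again, until the list for each key holds only one value.

The map function: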

    map = function(){
        for(var  key in this){
            emit(key, {count : 1});
        }
    };

The reduce function:

    reduce = function(key, emits) {
        // Declare total locally so it does not leak into the global scope.
        var total = 0;
        for (var i in emits) {
            total += emits[i].count;
        }
        return {"count" : total};
    };

reduce can be called like this, including on its own earlier results:

    r1 = reduce("x", [{count : 1, id : 1}, {count : 1, id : 2}]);
    r2 = reduce("x", [{count : 1, id : 3}]);
    reduce("x", [r1, r2]);
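The property behind those calls is that reduce must accept its own output as input. Here is a small Ruby port of the function to check that, offered as a sketch rather than anything MongoDB runs itself. The name `reduce_counts` is hypothetical.

```ruby
# Ruby port of the JavaScript reduce function above: sums the "count" fields
# and returns a value with the same shape as its inputs.
def reduce_counts(_key, emits)
  { 'count' => emits.inject(0) { |total, emit| total + emit['count'] } }
end

r1 = reduce_counts('x', [{ 'count' => 1, 'id' => 1 }, { 'count' => 1, 'id' => 2 }])
r2 = reduce_counts('x', [{ 'count' => 1, 'id' => 3 }])

# Reducing the partial results gives the same answer as reducing everything at once.
combined    = reduce_counts('x', [r1, r2])
all_at_once = reduce_counts('x', [{ 'count' => 1 }, { 'count' => 1 }, { 'count' => 1 }])

puts combined.inspect
puts combined == all_at_once
```

Because the return value carries a `count` field just like the emitted values, partial results can be fed back in any grouping.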

The records in the database look like this:

            { "_id" : ObjectId("513c62592a1b9fda7f000045"), "url" : "http://ecx.images-amazon.com/images/I/51sf-lIfZrL.jpg", "height" : 500, "width" : 500, "type" : "large", "updated_at" : ISODate("2013-03-10T10:38:30.386Z"), "created_at" : ISODate("2013-03-10T10:38:30.386Z") }
        { "_id" : ObjectId("513c6e352a1b9f3ac300024d"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL._SL30_.jpg", "height" : 30, "width" : 16, "type" : "swatch", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }
        { "_id" : ObjectId("513c6e352a1b9f3ac300024e"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL._SL75_.jpg", "height" : 75, "width" : 40, "type" : "small", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }
        { "_id" : ObjectId("513c6e352a1b9f3ac300024f"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL._SL75_.jpg", "height" : 75, "width" : 40, "type" : "thumbnail", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }
        { "_id" : ObjectId("513c6e352a1b9f3ac3000250"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL._SL110_.jpg", "height" : 110, "width" : 58, "type" : "tiny", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }
        { "_id" : ObjectId("513c6e352a1b9f3ac3000251"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL._SL160_.jpg", "height" : 160, "width" : 85, "type" : "medium", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }
        { "_id" : ObjectId("513c6e352a1b9f3ac3000252"), "url" : "http://ecx.images-amazon.com/images/I/51JOaE64EFL.jpg", "height" : 500, "width" : 265, "type" : "large", "item_id" : ObjectId("513c6e352a1b9f3ac3000253") }            

Run MapReduce:

    db.blog.mapReduce(map, reduce, {out : "result_bak"})

Note that out names a newly created collection, which you can inspect with db.result_bak.find().

The result:

{
"result" : "result_bak",
"timeMillis" : 151,
"counts" : {
    "input" : 7,
    "emit" : 43,
    "reduce" : 6,
    "output" : 8
},
"ok" : 1,
}

The contents of the generated result_bak collection:

db.result_bak.find()

{ "_id" : "_id", "value" : { "count" : 7 } }
{ "_id" : "created_at", "value" : { "count" : 1 } }
{ "_id" : "height", "value" : { "count" : 7 } }
{ "_id" : "item_id", "value" : { "count" : 6 } }
{ "_id" : "type", "value" : { "count" : 7 } }
{ "_id" : "updated_at", "value" : { "count" : 1 } }
{ "_id" : "url", "value" : { "count" : 7 } }
{ "_id" : "width", "value" : { "count" : 7 } }  
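To see where those numbers come from, the three stages can be replayed in plain Ruby. This is only an in-memory sketch, not how MongoDB executes the job. The documents are reduced to their key sets, taken from the seven records listed above. It also assumes reduce is skipped for keys with a single emitted value, which would be consistent with the reported "reduce" : 6.

```ruby
# Key sets of the seven sample documents (values omitted for brevity).
full_doc  = %w[_id url height width type updated_at created_at]
item_doc  = %w[_id url height width type item_id]
documents = [full_doc] + Array.new(6) { item_doc }

# map: emit (key, {count: 1}) for every key of every document.
emitted = documents.flat_map { |keys| keys.map { |key| [key, { 'count' => 1 }] } }

# shuffle: group the emitted values into one list per key.
grouped = Hash.new { |hash, key| hash[key] = [] }
emitted.each { |key, value| grouped[key] << value }

# reduce: collapse each list into a single value.
# Assumption: keys with only one value are passed through without a reduce call.
reduce_calls = 0
result = {}
grouped.each do |key, values|
  if values.size == 1
    result[key] = values.first
  else
    reduce_calls += 1
    result[key] = { 'count' => values.inject(0) { |sum, v| sum + v['count'] } }
  end
end

puts "input=#{documents.size} emit=#{emitted.size} reduce=#{reduce_calls} output=#{result.size}"
result.sort.each { |key, value| puts "#{key}: #{value['count']}" }
```

The emit total is 7 keys from the first document plus 6 keys from each of the other six, which matches the 43 reported by mapReduce.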

Deploy Rails Projects Using MongoDB and Thin to CloudFoundry

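I deployed a Rails project to CloudFoundry, and the process was far from smooth. I ran into many problems, though most were self-inflicted: I did not read the official documentation carefully, assumed things worked a certain way, and ended up taking detours. The main problems are described below.

The project uses MongoDB as its database. On CloudFoundry the app server is the default one, WEBrick.

Database problems:

  • I had heard that CloudFoundry supports MongoDB, so I used it. I assumed cf would support every version and, without paying attention, used the latest version of mongoid. Once deployed, it simply would not work.

      Solution: downgrade mongoid to 2.6 or lower, and rewrite mongoid.yml to match.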
    
  • Database configuration for the production environment:

      production:
       host: <%= JSON.parse( ENV['VCAP_SERVICES'] )['mongodb-2.0'].first['credentials']['hostname'] rescue 'localhost'%>
       port: <%= JSON.parse( ENV['VCAP_SERVICES'] )['mongodb-2.0'].first['credentials']['port'] rescue '27017'%>
       database: <%= JSON.parse( ENV['VCAP_SERVICES'] )['mongodb-2.0'].first['credentials']['db'] rescue 'buyintime_prod' %>
       username: <%= JSON.parse( ENV['VCAP_SERVICES'] )['mongodb-2.0'].first['credentials']['username'] rescue ''%>
       password: <%= JSON.parse( ENV['VCAP_SERVICES'] )['mongodb-2.0'].first['credentials']['password'] rescue ''%>
    

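    At first I did not follow the official demo for the database configuration and tried some unorthodox approaches instead, which kept me stuck here for a long time.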
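The production configuration above parses VCAP_SERVICES once per field and falls back to a default on any error. The same idea can be written as a single helper that parses once, sketched here under assumptions. The method names `default_mongo_credentials` and `mongo_credentials` and the sample values are hypothetical; the `mongodb-2.0` label, the field names, and the fallback values are the ones in the configuration above.

```ruby
require 'json'

# Fallbacks copied from the rescue clauses in the mongoid.yml above.
def default_mongo_credentials
  {
    'hostname' => 'localhost',
    'port'     => '27017',
    'db'       => 'buyintime_prod',
    'username' => '',
    'password' => ''
  }
end

# Hypothetical helper: returns the credentials for the given service label,
# or the defaults when VCAP_SERVICES is missing, malformed, or lacks the label.
def mongo_credentials(vcap_json, label = 'mongodb-2.0')
  services = JSON.parse(vcap_json.to_s)
  default_mongo_credentials.merge(services.fetch(label).first.fetch('credentials'))
rescue JSON::ParserError, KeyError, NoMethodError, TypeError
  default_mongo_credentials
end

# Made-up sample in the shape the configuration above expects.
sample = '{"mongodb-2.0":[{"credentials":{"hostname":"10.0.0.5","port":25238,' \
         '"db":"db","username":"demo_user","password":"demo_pass"}}]}'

creds    = mongo_credentials(sample)
fallback = mongo_credentials(nil) # e.g. running locally without VCAP_SERVICES

puts creds['hostname']
puts fallback['db']
```

Unlike a bare `rescue` on each line, this version only swallows the errors a missing or malformed VCAP_SERVICES would cause.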

App server problems:

The default WEBrick raises an exception when the submitted form content exceeds a certain length.

    def read_request_line(socket)
      @request_line = read_line(socket, MAX_URI_LENGTH) if socket
      if @request_line.bytesize >= MAX_URI_LENGTH and @request_line[-1, 1] != LF
        raise HTTPStatus::RequestURITooLarge
      end
      @request_time = Time.now
      raise HTTPStatus::EOFError unless @request_line
      if /^(\S+)\s+(\S++)(?:\s+HTTP\/(\d+\.\d+))?\r?\n/mo =~ @request_line
        @request_method = $1
        @unparsed_uri   = $2
        @http_version   = HTTPVersion.new($3 ? $3 : "0.9")
      else
        rl = @request_line.sub(/\x0d?\x0a\z/o, '')
        raise HTTPStatus::BadRequest, "bad Request-Line `#{rl}'."
      end
    end
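The length check in that method applies to the request line, so it is data carried in the URL (for example a long query string) that trips it. Below is a standalone sketch of the same check, not WEBrick's own code. The method name `request_line_too_large?` is hypothetical, and the 2083-byte default is my assumption about the value of WEBrick's MAX_URI_LENGTH.

```ruby
# Mirrors the length check in the read_request_line method quoted above.
# Assumption: 2083 stands in for WEBrick's MAX_URI_LENGTH.
def request_line_too_large?(request_line, max_uri_length = 2083)
  # WEBrick reads at most max_uri_length bytes of the line. If the buffer is
  # full and does not end in LF, the real line was longer than the limit.
  read = request_line.byteslice(0, max_uri_length)
  read.bytesize >= max_uri_length && read[-1, 1] != "\n"
end

short_line = "GET /items?q=ruby HTTP/1.1\r\n"
long_line  = "GET /items?q=#{'a' * 3000} HTTP/1.1\r\n"

puts request_line_too_large?(short_line)
puts request_line_too_large?(long_line)
```

A request line that fits within the limit ends in LF inside the buffer, so only the truncated case is rejected.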

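The forms in this project submit fairly long content. It is still within the range that many servers accept, but not WEBrick. So I replaced it with Thin, which is simple to do. Add this to the Gemfile: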

    gem 'thin'

Inspecting the data stored in the database:

cf provides a client for accessing the database. Use it like this:

vmc tunnel

1: time
2: dos

Which service instance?> 1


1: none
2: mongo
3: mongodump
4: mongorestore
Which client would you like to start?> 1


Opening tunnel on port 10000... OK

Service connection info:
  username : ooooooo
  password : xxxxxxx
  name     : db
  url      : mongodb://gggyyyyyy@167.30.48.71:25238/db

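This prints the connection information for the database you want to access.

Then open another console, choose 2: mongo when asked for a client, and enter the credentials (username and password):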

use db
db.auth(username, password)

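Once you understand it, none of this is difficult. The trouble was mainly being unfamiliar with the "rules". Cloud platforms look very convenient to use, but there are still many details to watch out for.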

How to Use DTrace to Trace Ruby Execution

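I have recently been reading about DTrace. It is a general-purpose tool, but I focused mainly on analyzing the execution of Ruby programs. Using it for operating-system performance analysis (CPU, memory, I/O, file systems, and so on) looks fairly complicated, and I did not look into it closely.

Here is a simple one-line command: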

sudo dtrace -n 'ruby$target:::object-create {@objects[copyinstr(arg0)]=count();}' -c 'ruby -e "puts :hello"'

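The output is shown below. The line `-e:1: unterminated string meets end of file` suggests that the nested quotes in the -c argument did not reach Ruby intact, so the one-liner itself failed with a SyntaxError, but object creation was still counted: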

    dtrace: description 'ruby$target:::object-create ' matched 1 probe
    -e:1: unterminated string meets end of file
    dtrace: pid 15203 has exited

      #<Class:0x007fabf38e0700>                                         1
      ARGF.class                                                        1
      IOError                                                           1
      Mutex                                                             1
      NoMemoryError                                                     1
      SyntaxError                                                       1
      SystemStackError                                                  1
      ThreadGroup                                                       1
      Time                                                              1
      LoadError                                                         2
      Object                                                            2
      Gem::Specification                                                8
      Gem::Version                                                     10
      Hash                                                             11
      Gem::Requirement                                                 23
      Array                                                            96
      String                                                          260

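Ruby 2.0 already has probe support, so DTrace can be used with it directly.

For versions before 2.0, you need the ruby-dtrace gem to use DTrace.

While learning, I also wrote a DTrace script (in other words, a program in the D language), rb_flowinfo.d, which shows the flow of Ruby method calls: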

    #!/usr/sbin/dtrace -Zs
    #pragma D option quiet
    #pragma D option switchrate=10

    self int depth;

    dtrace:::BEGIN
    {
       printf("%s %6s %10s %16s:%-4s %-8s -- %s\n", "C", "PID", "DELTA(us)", "FILE", "LINE", "TYPE", "NAME");
    }

    ruby*:::method-entry, 
    ruby*:::method-return
    /self->last == 0/
    {
       self->last = timestamp;
    }

    ruby*:::method-entry
    /copyinstr(arg0) == "Object"/
    {
       this->delta = (timestamp - self->last) / 1000;
       this->name = strjoin(strjoin(copyinstr(arg0), "::"), copyinstr(arg1));
       printf("%d %6d %10d %16s:%-4d %-8s %*s-> %s\n", cpu, pid, this->delta,
             basename(copyinstr(arg2)), arg3, "method", self->depth * 2, "", this->name);

       self->depth++;
       self->last = timestamp;
    }

    ruby*:::method-return
    /copyinstr(arg0) == "Object"/
    {
       this->delta = (timestamp - self->last) / 1000;
       self->depth -= self->depth > 0 ? 1 : 0;
       this->name = strjoin(strjoin(copyinstr(arg0), "::"), copyinstr(arg1));
       printf("%d %6d %10d %16s:%-4d %-8s %*s<- %s\n", cpu, pid, this->delta,
             basename(copyinstr(arg2)), arg3, "method", self->depth * 2, "", this->name);
       self->last = timestamp;
    }

The Ruby script used for testing, trace_method_call.rb:

    dtrace_ruby $ cat trace_method_call.rb 
    #!/Users/wenleslie/.rvm/rubies/ruby-2.0.0-p0/bin/ruby
    def func_c
       puts "Function C"
       sleep 1
    end

    def func_b
       puts "Function B"
       sleep 6
       func_c
    end

    def func_a
       puts "Function A"
       func_b
    end

    func_a      

Run it:

    sudo ./rb_flowinfo.d -c ./trace_method_call.rb

The result:

C    PID  DELTA(us)             FILE:LINE TYPE     -- NAME
Function A
Function B
4  15702      35986 trace_method_call.rb:13   method   -> Object::func_a
4  15702         55 trace_method_call.rb:7    method     -> Object::func_b
Function C0  15702    6001065 trace_method_call.rb:2    method       -> Object::func_c

6  15702    1000922 trace_method_call.rb:5    method       <- Object::func_c
6  15702         27 trace_method_call.rb:11   method     <- Object::func_b
6  15702         19 trace_method_call.rb:16   method   <- Object::func_a
^C
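The bookkeeping in rb_flowinfo.d (a depth counter for indentation and a timestamp for the delta) can be illustrated without DTrace by using Ruby's built-in TracePoint, available since Ruby 2.0. This is a different mechanism swapped in for illustration, not a replacement for the D script. The methods are stripped-down stand-ins for trace_method_call.rb, and the `traced` list plays the role of the "Object" predicate.

```ruby
# Stand-ins for the methods in trace_method_call.rb (puts and sleep removed).
def func_c; end

def func_b
  func_c
end

def func_a
  func_b
end

traced = [:func_a, :func_b, :func_c] # plays the role of the "Object" predicate
lines = []
depth = 0
last = Time.now

trace = TracePoint.new(:call, :return) do |tp|
  next unless traced.include?(tp.method_id)

  now = Time.now
  delta_us = ((now - last) * 1_000_000).to_i # like (timestamp - self->last) / 1000
  if tp.event == :call
    lines << format('%10d %s-> %s', delta_us, '  ' * depth, tp.method_id)
    depth += 1 # indent everything called from inside this method
  else
    depth -= 1 if depth > 0 # same guard as the D script
    lines << format('%10d %s<- %s', delta_us, '  ' * depth, tp.method_id)
  end
  last = now
end

trace.enable
func_a
trace.disable

puts lines
```

As in the D script, the entry line is printed before the depth is incremented and the return line after it is decremented, so each `->` lines up with its matching `<-`.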

My DTrace notes on Evernote go into a little more detail. Have a look if you are interested.