CuriousY A world with wonder

Remote debugging with Pycharm

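I. On the remote machine, you need:

  1. The pydevd module: find pycharm-debug.egg in the PyCharm installation directory on your local development machine (use pycharm-debug-py3k.egg if the remote machine runs Python 3) and rename it to pydevd for brevity.

  2. A default.py file (the file name is not important), with the following content: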

    #!/usr/bin/env python
    # coding=utf-8
    
    """ Sets the packages path and optionally starts the Python remote debugging client.
    
    The Python remote debugging client depends on the settings of the variables defined in debug_conf.py.
    Set these variables in debug_conf.py to enable/disable debugging using either the JetBrains PyCharm or Eclipse PyDev
    remote debugging packages which must be copied to packages/pydebug.
    """
    
    import os
    import sys
    
    module_dir = os.path.dirname(os.path.realpath(__file__))
    # packages = os.path.join(module_dir, u'packages')
    # sys.path.insert(0, module_dir)
    sys.path.append(os.path.join(module_dir, 'pydebug'))
    
    remote_debugging = {
        'client_package_location': os.path.join(module_dir, 'pydebug'),
        'is_enabled': False,
        'host': None,
        'stderr_to_server': False,
        'stdout_to_server': False,
        'port': 15679,
        'suspend': True,
        'trace_only_current_thread': False,
        'overwrite_prev_trace': False,
        'patch_multiprocessing': False}
    
    def configure_remote_debugging():
    
        configuration_file = os.path.join(module_dir, 'debug_conf.py')
    
        if not os.path.exists(configuration_file):
            return
    
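        # execfile() exists only in Python 2; this form works on both Python 2 and 3.
        with open(configuration_file) as f:
            exec(compile(f.read(), configuration_file, 'exec'), remote_debugging)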
    
        if remote_debugging['is_enabled'] and os.path.exists(remote_debugging['client_package_location']):
            import pydevd
            try:
                pydevd.settrace(
                    remote_debugging['host'],
                    remote_debugging['stderr_to_server'],
                    remote_debugging['stdout_to_server'],
                    remote_debugging['port'],
                    remote_debugging['suspend'],
                    remote_debugging['trace_only_current_thread'],
                    remote_debugging['overwrite_prev_trace'],
                    remote_debugging['patch_multiprocessing'])
            except SystemExit as e:
                pass  # don't stop just because we couldn't connect to the debugger
    
    configure_remote_debugging()
    
  3. A debug_conf.py file (everything you may need to change lives in this file), with the following content:

    #!/usr/bin/env python
    
    host = 'localhost'
    port = 15679
    suspend = False
    is_enabled = True
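
default.py executes debug_conf.py with the remote_debugging dict as its global namespace, so every top-level assignment in debug_conf.py overwrites the default of the same name. The sketch below illustrates only that override mechanism. It is an assumption-laden stand-in: the config is inlined as a string instead of being read from a file, the defaults dict is trimmed to four keys, and the names defaults, debug_conf_source, settings and effective are hypothetical.

```python
# Defaults, as in default.py (trimmed to the keys this sketch uses).
defaults = {'is_enabled': False, 'host': None, 'port': 15679, 'suspend': True}

# Contents of debug_conf.py, inlined as a string so the sketch is a single file.
debug_conf_source = """
host = 'localhost'
port = 15679
suspend = False
is_enabled = True
"""

settings = dict(defaults)
# Top-level assignments in the executed source overwrite keys of the same name.
exec(compile(debug_conf_source, 'debug_conf.py', 'exec'), settings)

# exec also adds '__builtins__' to the dict, so keep only the known keys.
effective = {key: settings[key] for key in defaults}
print(effective)
```

Any key that debug_conf.py does not assign keeps its default value.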
    
Set variable from other module correctly

Problem

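Suppose there are three Python files: A.py, B.py and C.py. I want to keep some variables in A.py that the other files read from A, so that they act as shared global variables. If the variables in A never change after they are defined, there is no problem. But suppose I want B to reassign one of A's variables, say A.a, so that C sees the value set by B when it uses A.a. How should that be done?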

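There is something tricky here: if B.py uses from A import a (with or without an as alias), assigning to a directly in B.py does not change the value of A.a. This form of import behaves like a = A.a: it creates a variable in module B's namespace that points to the object A.a refers to. Reassigning that variable later only makes B.a point to a new object, while A.a is left unchanged (refer: stackoverflow).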
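
A minimal sketch of the pitfall. To keep it in one file, module A is built in memory with types.ModuleType and registered in sys.modules; that setup, and the names local_value and module_value, are assumptions for illustration and stand in for a real A.py on disk.

```python
import sys
import types

# Stand-in for A.py: a module object with one variable.
A = types.ModuleType("A")
A.a = 1
sys.modules["A"] = A

# Binds a new name `a` in this namespace to the object A.a refers to.
from A import a

a = 2  # rebinds the local name only; A.a is not touched

local_value = a
module_value = sys.modules["A"].a
print(local_value, module_value)
```

The local name and the module attribute now refer to different objects, which is exactly why B's assignment is invisible to C.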

Solution 1

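Import module A with import A, then assign to A.a directly in B. Note, however, that every other module that uses A.a must also import it with import A and not in any other form. The reason is the same as above: if C.py uses from A import a and that statement runs before B.py's code reassigns A.a, a separate C.a already exists, and the later reassignment of A.a cannot affect it.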
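
A sketch of Solution 1 under the same single-file assumption: module A is created in memory, and the hypothetical functions b_updates and c_reads stand in for code living in B.py and C.py.

```python
import sys
import types

# Stand-in for A.py.
A = types.ModuleType("A")
A.a = "original"
sys.modules["A"] = A


def b_updates():
    # What B.py would do: import the module and assign through it.
    import A
    A.a = "changed by B"


def c_reads():
    # What C.py would do: look the attribute up on the module at use time.
    import A
    return A.a


before = c_reads()
b_updates()
after = c_reads()
print(before, after)
```

Because both sides go through the module object, the lookup A.a always returns the current binding.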

Solution 2

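Use a mutable object as the global variable, for example make A.a a list. Then, however the other modules import it, they can modify A.a and see the modified A.a, as long as they mutate the object in place rather than rebind the name.

For maintainability, solution 2 is the better choice.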
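
A sketch of Solution 2, again with an in-memory stand-in for A.py. The aliases a_in_b and a_in_c are hypothetical and represent the names that B.py and C.py would get from from A import a.

```python
import sys
import types

# Stand-in for A.py, with a list as the shared variable.
A = types.ModuleType("A")
A.a = ["original"]
sys.modules["A"] = A

from A import a as a_in_b  # what B.py would hold
from A import a as a_in_c  # what C.py would hold

# Mutating in place changes the one list object that all names refer to.
a_in_b[0] = "changed by B"

seen_by_c = a_in_c[0]
same_object = a_in_b is A.a
print(seen_by_c, same_object)
```

Writing a_in_b = ["new list"] instead would rebind the name and bring back the original problem, so the shared object must only ever be mutated.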

Do not use multiprocessing.Queue in multiprocessing.Pool

Problem

Straight to the code:

    from multiprocessing import Process, Value, Array, Manager, Queue, Pool
    import time

    def func(n, a, q):
        # The worker's stdout is not visible here, so write the output to a file instead.
        with open('/tmp/solution_one.txt', 'w') as f:
            for i in range(20):
                time.sleep(1)
                q.put(i)
                f.write(str(n.value))
                f.write(a[0])
                f.write('\n')

    def main():
        num = Value('d', 1.1)
        arr = Array('u', ['a'])
        q = Queue()
        pool = Pool(1)
        result = pool.apply_async(func, (num, arr, q,))
        print(result.get())


    if __name__ == "__main__":
        main()

Running it raises:

    RuntimeError: Synchronized objects should only be shared between processes through inheritance

Reason

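After some digging (see ftofficer|张聪的blog » Python multiprocessing 使用手记2 – 跨进程对象共享): objects that are shared through inheritance can only be handed to a child process at the moment it is created. The workers of a pool already exist when a task is submitted, and the task's arguments are pickled and sent to them through an internal queue, so passing such an object as an argument raises this error.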

Solution

Either drop the process pool and fork a new process directly with Process(), or stop using objects shared through inheritance and use multiprocessing.Manager().Queue(), which shares the object through a proxy.
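
A minimal sketch of the second option, assuming a pool with a single worker. The task function produce, its count parameter and the timeouts are hypothetical choices for illustration, not part of the original script.

```python
from multiprocessing import Manager, Pool


def produce(q, count):
    # Runs in the pool worker; q is a proxy, so it can be pickled as a task argument.
    for i in range(count):
        q.put(i)
    return count


def main():
    manager = Manager()
    q = manager.Queue()  # lives in the manager process, reached through a proxy
    pool = Pool(1)
    try:
        result = pool.apply_async(produce, (q, 3))
        produced = result.get(timeout=30)
        # Drain the queue while the manager is still running.
        items = [q.get(timeout=5) for _ in range(produced)]
    finally:
        pool.close()
        pool.join()
        manager.shutdown()
    return items


if __name__ == "__main__":
    print(main())
```

The trade-off is that every put and get goes through the manager process, which is slower than a plain multiprocessing.Queue but works with Pool.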

Install and setup Tor

Tor ships a command-line client that runs a local SOCKS proxy and routes traffic through the Tor network.

To install Tor on OS X:

    brew install tor

Then just type ‘tor’ to start the proxy service:

    ➜  data-collector git:(master) ✗ tor
    May 26 14:12:25.869 [notice] Tor v0.2.7.6 running on Darwin with Libevent 2.0.22-stable, OpenSSL 1.0.2g and Zlib 1.2.5.
    May 26 14:12:25.869 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
    May 26 14:12:25.869 [notice] Configuration file "/usr/local/etc/tor/torrc" not present, using reasonable defaults.
    May 26 14:12:25.871 [notice] Opening Socks listener on 127.0.0.1:9050
    May 26 14:12:25.000 [notice] Parsing GEOIP IPv4 file /usr/local/Cellar/tor/0.2.7.6/share/tor/geoip.
    May 26 14:12:25.000 [notice] Parsing GEOIP IPv6 file /usr/local/Cellar/tor/0.2.7.6/share/tor/geoip6.
    May 26 14:12:26.000 [notice] Bootstrapped 0%: Starting
    May 26 14:12:26.000 [notice] Bootstrapped 5%: Connecting to directory server
    May 26 14:12:26.000 [notice] Bootstrapped 80%: Connecting to the Tor network
    May 26 14:12:26.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
    May 26 14:12:29.000 [notice] Bootstrapped 90%: Establishing a Tor circuit
    May 26 14:12:30.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
    May 26 14:12:30.000 [notice] Bootstrapped 100%: Done

The default SOCKS proxy address is 127.0.0.1:9050.

You can now use the SOCKS proxy. To stop it, press Ctrl-C.
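
To check from Python that the proxy is up, one option is to send the SOCKS5 greeting and look at the reply. The sketch below uses only the standard library; the function name socks5_proxy_ready and the 3-second timeout are assumptions, and it verifies only the handshake, not that traffic actually flows through Tor.

```python
import socket


def socks5_proxy_ready(host="127.0.0.1", port=9050, timeout=3.0):
    """Return True if a SOCKS5 proxy at host:port accepts a no-auth greeting."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # SOCKS5 greeting: version 5, one method offered, method 0 (no authentication).
            sock.sendall(b"\x05\x01\x00")
            reply = sock.recv(2)
    except OSError:
        # Nothing listening, connection refused, or the read timed out.
        return False
    # The proxy answers with its version and the method it selected.
    return reply == b"\x05\x00"


if __name__ == "__main__":
    print(socks5_proxy_ready())
```

With tor running as shown above, the function should return True; after stopping tor it returns False.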

It's nothing but a test of kramdown syntax

The kramdown syntax is based on the Markdown syntax and has been enhanced with features that are found in other Markdown implementations like Maruku, PHP Markdown Extra and Pandoc. However, it strives to provide a strict syntax with definite rules and therefore isn’t completely compatible with Markdown (for example, some constructs need a blank line before them where Markdown does not).


Setext Style H1 (the line above must be blank)

Setext Style H2


atx Style H1 (the line above must be blank)

atx Style H2

atx Style H3

atx Style H4

atx Style H5
atx Style H6

I am italic

I am bold


  • option 1 (*, + or - all work as the marker)
  • option 2
  • option 3
    • nested option 1
    • nested option 2

  1. list 1
  2. list 2
  3. list 3

A sample blockquote.

Nested blockquotes are also possible.

Headers work too

This is the outer quote again.


Example: I am a simple code span

  I am a large code block
  it needs a leading tab
  or four or more spaces
This is also a code block.
~~~
Ending lines must have at least as
many tildes as the starting line.
# Fixme: Hack to add our logger to flask server request handler.
WSGIRequestHandler.log_request = hacked_log_request


@WEB_SERVER.route('/_backend/register_splunk', methods=['POST'])
def register_splunk():
    data = request.json
    for data_type in data['data_types']:
        for func in REGISTER_OUTPUT_METHODS:
            func(data['splunk_uri'], data['splunk_username'], data['splunk_password'],
                 data['splunk_index'], data_type['source'], data_type['sourcetype'], int(data['time_range']) * 3600)
    return 'register successfully!'


@WEB_SERVER.route('/post/<int:post_id>')
def show_post(post_id):
    # show the post with the given id, the id is an integer
    return 'Post %d' % post_id

I am a footnote 1


I am an image inserted inline: alt text. I am an image inserted by reference: ![alt text][image_id] [image_id]/url/to/img.jpg “Title”


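| Default aligned | Left aligned | Center aligned | Right aligned |
|-----------------|:-------------|:--------------:|--------------:|
| First body part | Second cell  | Third cell     | fourth cell   |
| Second line     | foo          | strong         | baz           |
| Third line      | quux         | baz            | bar           |
| Second body     |              |                |               |
| 2 line          |              |                |               |
| Footer row      |              |                |               |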

I am a term
The content of the definition

  1. Footnote 1
