From Streaming API: Methods | dev.twitter.com:
Specifies keywords to track. Phrases of keywords are specified by a comma-separated list. Queries are subject to Track Limitations, described in Track Limiting and subject to access roles, described in the statuses/filter method. Comma separated keywords and phrases are logical ORs, phrases are logical ANDs. Words within phrases are delimited by spaces. A tweet matches if any phrase matches. A phrase matches if all of the words are present in the tweet. (e.g. 'the twitter' is the AND twitter, and 'the,twitter' is the OR twitter.). Terms are exact-matched, and also exact-matched ignoring punctuation. Exact matching on phrases, that is, keywords with spaces, is not supported. Keywords containing punctuation will only exact match tokens and, other than keywords prefixed by # and @, will tend to never match. Non-space separated languages, such as CJK and Arabic, are currently unsupported as tokenization occurs on whitespace. Other UTF-8 phrases should exact match correctly, but will not substitute similar characters to their least-common-denominator. For all these cases, consider falling back to the Search REST API.
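The matching rule quoted above can be sketched in a few lines of Python. This is only an illustration of the description, not Twitter's actual implementation: the function names (tokenize, parse_track, matches) are hypothetical, punctuation is approximated with ASCII string.punctuation, and case handling is left out because the quoted text does not mention it.

```python
import string


def tokenize(text):
    """Split on whitespace; keep each token as-is and with edge punctuation stripped."""
    tokens = set()
    for raw in text.split():
        tokens.add(raw)
        stripped = raw.strip(string.punctuation)  # assumption: ASCII punctuation only
        if stripped:
            tokens.add(stripped)
    return tokens


def parse_track(track):
    """Turn 'the twitter,nhk' into [['the', 'twitter'], ['nhk']]."""
    return [phrase.split() for phrase in track.split(',') if phrase.strip()]


def matches(track, tweet_text):
    """A tweet matches if ANY phrase matches; a phrase matches if ALL its words are present."""
    tokens = tokenize(tweet_text)
    return any(all(word in tokens for word in phrase)
               for phrase in parse_track(track))


if __name__ == "__main__":
    print(matches('the twitter', 'I love the new twitter'))  # AND: both words required
    print(matches('the,twitter', 'I love twitter'))          # OR: either word is enough
    print(matches('nhk', '日本語nhkテスト'))                  # no whitespace, so no token 'nhk'
```

The last call shows why text without spaces between words is a problem: the whole string becomes a single token, so the keyword never matches on its own.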
#!/usr/bin/env ruby
# coding: utf-8

require 'net/http'
require 'uri'
require 'rubygems'
require 'json'

USERNAME = '_USERNAME_' # replace with your own
PASSWORD = '_PASSWORD_' # replace with your own

# The URL seems to have changed (2010/07/22)
#uri = URI.parse('http://stream.twitter.com/spritzer.json')
uri = URI.parse('http://stream.twitter.com/1/statuses/sample.json')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new(uri.request_uri)
  # The Streaming API supports Basic authentication only
  request.basic_auth(USERNAME, PASSWORD)
  http.request(request) do |response|
    raise 'Response is not chunked' unless response.chunked?
    response.read_body do |chunk|
      # Ignore blank lines: if JSON parsing fails, move on to the next chunk
      status = JSON.parse(chunk) rescue next
      # Skip messages without a 'text' field, such as deletion notices
      next unless status['text']
      user = status['user']
      puts "#{user['screen_name']}: #{status['text']}"
    end
  end
end
#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
import tweepy
import simplejson
#from pit import Pit

class StreamListener(tweepy.StreamListener):
    def on_data(self, data):
        # Each message arrives as a raw JSON string
        data = simplejson.loads(data)
        # Print only messages that carry a 'text' field
        if data.has_key('text'):
            print data['text']

def main():
    user = 'xxxxx'    # replace with your own
    passwd = 'yyyyy'  # replace with your own
    stream = tweepy.Stream(user, passwd, StreamListener())
    # Track the hashtag #nhk
    stream.filter(track=('#nhk',))

if __name__ == "__main__":
    main()
The actual processing is done in streaming.py:
def filter(self, follow=None, track=None, async=False):
    params = {}
    self.headers['Content-type'] = "application/x-www-form-urlencoded"
    if self.running:
        raise TweepError('Stream object already connected!')
    self.url = '/%i/statuses/filter.json?delimited=length' % STREAM_VERSION
    if follow:
        params['follow'] = ','.join(map(str, follow))
    if track:
        params['track'] = ','.join(map(str, track))
    self.body = urllib.urlencode(params)
    self._start(async)
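To see what the filter method above actually sends, the URL and form body can be rebuilt outside of tweepy. The following is a minimal Python 3 sketch, not tweepy code: build_filter_request is a hypothetical helper, and STREAM_VERSION = 1 is an assumption based on the /1/ path used in the Ruby example.

```python
from urllib.parse import urlencode

STREAM_VERSION = 1  # assumption: taken from the /1/ path in the sample URL


def build_filter_request(follow=None, track=None):
    """Return (url, body) built the same way as the filter method in streaming.py."""
    params = {}
    if follow:
        # User IDs are joined into one comma-separated value
        params['follow'] = ','.join(map(str, follow))
    if track:
        # Keywords and phrases are joined into one comma-separated value
        params['track'] = ','.join(map(str, track))
    url = '/%i/statuses/filter.json?delimited=length' % STREAM_VERSION
    return url, urlencode(params)


if __name__ == "__main__":
    url, body = build_filter_request(track=('#nhk', 'the twitter'))
    print(url)   # /1/statuses/filter.json?delimited=length
    print(body)  # track=%23nhk%2Cthe+twitter
```

The comma that separates phrases and the space that separates words within a phrase are both percent-encoded in the body, so the OR/AND structure of the track parameter survives the form encoding unchanged.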