scrapy-splash

Scrapy middleware for Splash. Install it with pip install scrapy-splash.

Preparing the project

$ scrapy startproject scrapy_splash_tutorial
New Scrapy project 'scrapy_splash_tutorial', using template directory '/usr/local/lib/python3.8/site-packages/scrapy/templates/project', created in:
/work/scrapy/scrapy_splash_tutorial

You can start your first spider with:
cd scrapy_splash_tutorial
scrapy genspider example example.com

.
├── scrapy.cfg
└── scrapy_splash_tutorial
    ├── __init__.py
    ├── __pycache__
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    └── spiders
        ├── __init__.py
        └── __pycache__

Customizing settings.py

DOWNLOADER_MIDDLEWARES

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

Order 723 is just before HttpProxyMiddleware (750) in default scrapy settings.

The Splash middlewares have to run before HttpProxyMiddleware, so their order values must be lower than 750.
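To see why 723 and 725 satisfy this constraint, the ordering can be sketched in plain Python without Scrapy installed. The base values below are a small subset of Scrapy's default downloader middleware orders as I understand them, so treat them as assumptions and check DOWNLOADER_MIDDLEWARES_BASE for your Scrapy version. effective_order is a hypothetical helper, not a Scrapy function.

```python
# Subset of Scrapy's default downloader middleware orders (assumed values;
# verify against DOWNLOADER_MIDDLEWARES_BASE in your Scrapy version).
BASE = {
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 590,
    "scrapy.downloadermiddlewares.cookies.CookiesMiddleware": 700,
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 750,
    "scrapy.downloadermiddlewares.stats.DownloaderStats": 850,
}

# The project-level setting from settings.py.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}


def effective_order(base, custom):
    """Merge base and custom orders and return names sorted by order value."""
    merged = {**base, **custom}  # custom entries override base entries
    return [name for name, _ in sorted(merged.items(), key=lambda kv: kv[1])]


order = effective_order(BASE, DOWNLOADER_MIDDLEWARES)
for name in order:
    print(name)
```

With these values the Splash middlewares land between CookiesMiddleware and HttpProxyMiddleware, and HttpCompressionMiddleware moves after HttpProxyMiddleware, which matches the "Enabled downloader middlewares" list that Scrapy logs at startup.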

SPLASH_URL

Set SPLASH_URL to the URL of the Splash instance.

SPLASH_URL = 'http://splash:8050/'

Splash is running under docker-compose here, so the host name is the service name splash.

SPIDER_MIDDLEWARES

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

Enable SplashDeduplicateArgsMiddleware. It avoids sending the same Splash arguments to the Splash server over and over again.

DUPEFILTER_CLASS / HTTPCACHE_STORAGE

DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'

Scrapy provides no way to override the request fingerprint calculation globally, so DUPEFILTER_CLASS and HTTPCACHE_STORAGE are set to the Splash-aware classes instead.
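The motivation, as I understand it, is that a fingerprint which ignores the Splash arguments cannot tell apart two requests for the same URL with different rendering options. The sketch below only illustrates that idea with hashlib; it is not the algorithm used by Scrapy or scrapy-splash, and both function names are hypothetical.

```python
import hashlib
import json


def naive_fingerprint(url, splash_args):
    """Hypothetical fingerprint that looks at the URL only and ignores splash_args."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def splash_aware_fingerprint(url, splash_args):
    """Hypothetical fingerprint that also hashes the Splash arguments."""
    payload = json.dumps({"url": url, "splash": splash_args}, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


url = "http://quotes.toscrape.com/js/"
short_wait = {"wait": 0.5}
long_wait = {"wait": 2}

# The naive version treats both requests as duplicates of each other.
naive_collides = naive_fingerprint(url, short_wait) == naive_fingerprint(url, long_wait)

# The Splash-aware version keeps them apart.
aware_collides = (
    splash_aware_fingerprint(url, short_wait) == splash_aware_fingerprint(url, long_wait)
)

print(naive_collides, aware_collides)
```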

Example spider implementation

import scrapy
from scrapy_splash import SplashRequest

class MySpider(scrapy.Spider):
    start_urls = ["http://example.com", "http://example.com/foo"]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url, self.parse,
                endpoint='render.html',
                args={'wait': 0.5},
            )

    def parse(self, response):
        # response.body is a result of render.html call; it
        # contains HTML processed by a browser.
        # …
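
  1. Use SplashRequest instead of scrapy.Request so that the page is rendered by Splash.
  2. Pass arguments to Splash through args.
  3. Select the Splash endpoint with endpoint. render.html is already the default for SplashRequest; render.json is the default only when a plain scrapy.Request with meta['splash'] is used.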
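
SplashRequest is a convenience wrapper: according to the scrapy-splash README, the same options can be handed to a plain scrapy.Request through the 'splash' key of meta. The sketch below only builds that dictionary so it runs without Scrapy installed. build_splash_meta is a hypothetical helper; the 'splash', 'endpoint' and 'args' keys are the ones scrapy-splash reads.

```python
def build_splash_meta(endpoint="render.html", **args):
    """Build the meta dict that mirrors SplashRequest(endpoint=..., args=...)."""
    return {"splash": {"endpoint": endpoint, "args": args}}


# Same options as the spider above: render.html with a 0.5 second wait.
meta = build_splash_meta(wait=0.5)
print(meta)

# Inside a spider this would be used roughly as:
#   yield scrapy.Request(url, self.parse, meta=meta)
```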

Implementing a spider for the quotes JS page, based on the example

The target is http://quotes.toscrape.com/js/, a page whose content is generated by JavaScript.

The spider for this test is named quotesjs.

$ scrapy genspider quotesjs quotes.toscrape.com
Created spider 'quotesjs' using template 'basic' in module:
scrapy_splash_tutorial.spiders.quotesjs

Inspecting the page with Chrome DevTools (F12)

[Screenshots: Chrome DevTools]

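Analyzing the page with scrapy shell

The shell has to go through Splash, so start it with scrapy shell 'http://splash:8050/render.html?url=http://<target_url>&timeout=10&wait=2'.
The wait=2 parameter is important (pick a number of seconds that suits the target): without it, Splash may return HTML whose rendering has not finished yet.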
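
When the target URL has its own query string, it is safer to percent-encode it before putting it into the url parameter. A minimal sketch with the standard library follows; render_html_url is a hypothetical helper, and the default values for wait and timeout are simply the ones used in the command above.

```python
from urllib.parse import urlencode

SPLASH_URL = "http://splash:8050/"  # same value as in settings.py


def render_html_url(target_url, wait=2, timeout=10):
    """Build a render.html URL for target_url with percent-encoded parameters."""
    query = urlencode({"url": target_url, "timeout": timeout, "wait": wait})
    return f"{SPLASH_URL.rstrip('/')}/render.html?{query}"


shell_url = render_html_url("http://quotes.toscrape.com/js/")
# Pass the result to the shell in single quotes: scrapy shell '<shell_url>'
print(shell_url)
```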

$ scrapy shell 'http://splash:8050/render.html?url=http://quotes.toscrape.com/js/'
2020-05-06 18:09:33 [scrapy.utils.log] INFO: Scrapy 2.1.0 started (bot: scrapy_splash_tutorial)
2020-05-06 18:09:33 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.8.2 (default, Apr 16 2020, 18:36:10) - [GCC 8.3.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Linux-4.19.76-linuxkit-x86_64-with-glibc2.2.5
2020-05-06 18:09:33 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2020-05-06 18:09:33 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_splash_tutorial',
'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
'EDITOR': '/usr/bin/vim',
'HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage',
'LOGSTATS_INTERVAL': 0,
'NEWSPIDER_MODULE': 'scrapy_splash_tutorial.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['scrapy_splash_tutorial.spiders']}
2020-05-06 18:09:33 [scrapy.extensions.telnet] INFO: Telnet Password: 2dd3dc32afe40826
2020-05-06 18:09:33 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage']
2020-05-06 18:09:33 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy_splash.SplashCookiesMiddleware',
'scrapy_splash.SplashMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-05-06 18:09:33 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy_splash.SplashDeduplicateArgsMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-05-06 18:09:33 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2020-05-06 18:09:33 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-05-06 18:09:33 [scrapy.core.engine] INFO: Spider opened
2020-05-06 18:09:33 [scrapy.core.engine] DEBUG: Crawled (404) <GET http://splash:8050/robots.txt> (referer: None)
2020-05-06 18:09:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://splash:8050/render.html?url=http://quotes.toscrape.com/js/> (referer: None)
[s] Available Scrapy objects:
[s] scrapy scrapy module (contains scrapy.Request, scrapy.Selector, etc)
[s] crawler <scrapy.crawler.Crawler object at 0x7f8aaede0f10>
[s] item {}
[s] request <GET http://splash:8050/render.html?url=http://quotes.toscrape.com/js/>
[s] response <200 http://splash:8050/render.html?url=http://quotes.toscrape.com/js/>
[s] settings <scrapy.settings.Settings object at 0x7f8aaede0b20>
[s] spider <DefaultSpider 'default' at 0x7f8aaeb9a9a0>
[s] Useful shortcuts:
[s] fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
[s] fetch(req) Fetch a scrapy.Request and update local objects
[s] shelp() Shell help (print this help)
[s] view(response) View response in a browser
>>> response.css('.container .quote').get()
'<div class="quote"><span class="text">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span><span>by <small class="author">Albert Einstein</small></span><div class="tags">Tags: <a class="tag">change</a> <a class="tag">deep-thoughts</a> <a class="tag">thinking</a> <a class="tag">world</a></div></div>'
>>> response.css('.container .quote').getall()
['<div class="quote"><span class="text">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span><span>by <small class="author">Albert Einstein</small></span><div class="tags">Tags: <a class="tag">change</a> <a class="tag">deep-thoughts</a> <a class="tag">thinking</a> <a class="tag">world</a></div></div>', '<div class="quote"><span class="text">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span><span>by <small class="author">J.K. Rowling</small></span><div class="tags">Tags: <a class="tag">abilities</a> <a class="tag">choices</a></div></div>', '<div class="quote"><span class="text">“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”</span><span>by <small class="author">Albert Einstein</small></span><div class="tags">Tags: <a class="tag">inspirational</a> <a class="tag">life</a> <a class="tag">live</a> <a class="tag">miracle</a> <a class="tag">miracles</a></div></div>', '<div class="quote"><span class="text">“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”</span><span>by <small class="author">Jane Austen</small></span><div class="tags">Tags: <a class="tag">aliteracy</a> <a class="tag">books</a> <a class="tag">classic</a> <a class="tag">humor</a></div></div>', '<div class="quote"><span class="text">“Imperfection is beauty, madness is genius and it\'s better to be absolutely ridiculous than absolutely boring.”</span><span>by <small class="author">Marilyn Monroe</small></span><div class="tags">Tags: <a class="tag">be-yourself</a> <a class="tag">inspirational</a></div></div>', '<div class="quote"><span class="text">“Try not to become a man of success. 
Rather become a man of value.”</span><span>by <small class="author">Albert Einstein</small></span><div class="tags">Tags: <a class="tag">adulthood</a> <a class="tag">success</a> <a class="tag">value</a></div></div>', '<div class="quote"><span class="text">“It is better to be hated for what you are than to be loved for what you are not.”</span><span>by <small class="author">André Gide</small></span><div class="tags">Tags: <a class="tag">life</a> <a class="tag">love</a></div></div>', '<div class="quote"><span class="text">“I have not failed. I\'ve just found 10,000 ways that won\'t work.”</span><span>by <small class="author">Thomas A. Edison</small></span><div class="tags">Tags: <a class="tag">edison</a> <a class="tag">failure</a> <a class="tag">inspirational</a> <a class="tag">paraphrased</a></div></div>', '<div class="quote"><span class="text">“A woman is like a tea bag; you never know how strong it is until it\'s in hot water.”</span><span>by <small class="author">Eleanor Roosevelt</small></span><div class="tags">Tags: <a class="tag">misattributed-eleanor-roosevelt</a></div></div>', '<div class="quote"><span class="text">“A day without sunshine is like, you know, night.”</span><span>by <small class="author">Steve Martin</small></span><div class="tags">Tags: <a class="tag">humor</a> <a class="tag">obvious</a> <a class="tag">simile</a></div></div>']

Customizing items.py

import scrapy

class QuoteItem(scrapy.Item):
    quote = scrapy.Field()
    author = scrapy.Field()

Customizing quotesjs.py

# -*- coding: utf-8 -*-
import scrapy
from scrapy_splash import SplashRequest
from scrapy_splash_tutorial.items import QuoteItem

class QuotesjsSpider(scrapy.Spider):
    name = 'quotesjs'
    allowed_domains = ['quotes.toscrape.com']
    start_urls = ['http://quotes.toscrape.com/js/']

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url, self.parse,
                endpoint='render.html',
                args={'wait': 0.5},
            )

    def parse(self, response):
        for q in response.css(".container .quote"):
            quote = QuoteItem()
            quote["author"] = q.css(".author::text").extract_first()
            quote["quote"] = q.css(".text::text").extract_first()
            yield quote

Running the crawler

Run the crawler with scrapy crawl quotesjs -o result.json.

$ scrapy crawl quotesjs -o result.json
2020-05-06 18:34:02 [scrapy.utils.log] INFO: Scrapy 2.1.0 started (bot: scrapy_splash_tutorial)
2020-05-06 18:34:02 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.8.2 (default, Apr 16 2020, 18:36:10) - [GCC 8.3.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Linux-4.19.76-linuxkit-x86_64-with-glibc2.2.5
2020-05-06 18:34:02 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2020-05-06 18:34:02 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'scrapy_splash_tutorial',
'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
'EDITOR': '/usr/bin/vim',
'HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage',
'NEWSPIDER_MODULE': 'scrapy_splash_tutorial.spiders',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['scrapy_splash_tutorial.spiders']}
2020-05-06 18:34:02 [scrapy.extensions.telnet] INFO: Telnet Password: febe521f79cff551
2020-05-06 18:34:02 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats']
2020-05-06 18:34:02 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy_splash.SplashCookiesMiddleware',
'scrapy_splash.SplashMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-05-06 18:34:02 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy_splash.SplashDeduplicateArgsMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-05-06 18:34:02 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2020-05-06 18:34:02 [scrapy.core.engine] INFO: Spider opened
2020-05-06 18:34:02 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-05-06 18:34:02 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-05-06 18:34:02 [py.warnings] WARNING: /usr/local/lib/python3.8/site-packages/scrapy_splash/request.py:41: ScrapyDeprecationWarning: Call to deprecated function to_native_str. Use to_unicode instead.
url = to_native_str(url)

2020-05-06 18:34:03 [scrapy.core.engine] DEBUG: Crawled (404) <GET http://quotes.toscrape.com/robots.txt> (referer: None)
2020-05-06 18:34:03 [scrapy.core.engine] DEBUG: Crawled (404) <GET http://splash:8050/robots.txt> (referer: None)
2020-05-06 18:34:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://quotes.toscrape.com/js/ via http://splash:8050/render.html> (referer: None)
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Albert Einstein',
'quote': '“The world as we have created it is a process of our thinking. It '
'cannot be changed without changing our thinking.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'J.K. Rowling',
'quote': '“It is our choices, Harry, that show what we truly are, far more '
'than our abilities.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Albert Einstein',
'quote': '“There are only two ways to live your life. One is as though '
'nothing is a miracle. The other is as though everything is a '
'miracle.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Jane Austen',
'quote': '“The person, be it gentleman or lady, who has not pleasure in a '
'good novel, must be intolerably stupid.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Marilyn Monroe',
'quote': "“Imperfection is beauty, madness is genius and it's better to be "
'absolutely ridiculous than absolutely boring.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Albert Einstein',
'quote': '“Try not to become a man of success. Rather become a man of value.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'André Gide',
'quote': '“It is better to be hated for what you are than to be loved for '
'what you are not.”'}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Thomas A. Edison',
'quote': "“I have not failed. I've just found 10,000 ways that won't work.”"}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Eleanor Roosevelt',
'quote': '“A woman is like a tea bag; you never know how strong it is until '
"it's in hot water.”"}
2020-05-06 18:34:04 [scrapy.core.scraper] DEBUG: Scraped from <200 http://quotes.toscrape.com/js/>
{'author': 'Steve Martin',
'quote': '“A day without sunshine is like, you know, night.”'}
2020-05-06 18:34:04 [scrapy.core.engine] INFO: Closing spider (finished)
2020-05-06 18:34:04 [scrapy.extensions.feedexport] INFO: Stored json feed (10 items) in: result.json
2020-05-06 18:34:04 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 960,
'downloader/request_count': 3,
'downloader/request_method_count/GET': 2,
'downloader/request_method_count/POST': 1,
'downloader/response_bytes': 9757,
'downloader/response_count': 3,
'downloader/response_status_count/200': 1,
'downloader/response_status_count/404': 2,
'elapsed_time_seconds': 2.285135,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2020, 5, 6, 9, 34, 4, 575789),
'item_scraped_count': 10,
'log_count/DEBUG': 13,
'log_count/INFO': 11,
'log_count/WARNING': 1,
'memusage/max': 56578048,
'memusage/startup': 56578048,
'response_received_count': 3,
'robotstxt/request_count': 2,
'robotstxt/response_count': 2,
'robotstxt/response_status_count/404': 2,
'scheduler/dequeued': 2,
'scheduler/dequeued/memory': 2,
'scheduler/enqueued': 2,
'scheduler/enqueued/memory': 2,
'splash/render.html/request_count': 1,
'splash/render.html/response_count/200': 1,
'start_time': datetime.datetime(2020, 5, 6, 9, 34, 2, 290654)}
2020-05-06 18:34:04 [scrapy.core.engine] INFO: Spider closed (finished)

The generated result.json is shown below.

[
{"author": "Albert Einstein", "quote": "\u201cThe world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.\u201d"},
{"author": "J.K. Rowling", "quote": "\u201cIt is our choices, Harry, that show what we truly are, far more than our abilities.\u201d"},
{"author": "Albert Einstein", "quote": "\u201cThere are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.\u201d"},
{"author": "Jane Austen", "quote": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d"},
{"author": "Marilyn Monroe", "quote": "\u201cImperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.\u201d"},
{"author": "Albert Einstein", "quote": "\u201cTry not to become a man of success. Rather become a man of value.\u201d"},
{"author": "Andr\u00e9 Gide", "quote": "\u201cIt is better to be hated for what you are than to be loved for what you are not.\u201d"},
{"author": "Thomas A. Edison", "quote": "\u201cI have not failed. I've just found 10,000 ways that won't work.\u201d"},
{"author": "Eleanor Roosevelt", "quote": "\u201cA woman is like a tea bag; you never know how strong it is until it's in hot water.\u201d"},
{"author": "Steve Martin", "quote": "\u201cA day without sunshine is like, you know, night.\u201d"}
]
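
The \u201c, \u201d and \u00e9 sequences are ordinary JSON escapes, so nothing is lost: any JSON parser turns them back into the original characters. A minimal sketch with the standard json module, using one entry copied from the file as an inline sample instead of reading result.json from disk:

```python
import json

# One entry from result.json, kept as a raw string so the \u escapes stay literal.
sample = r'''[
{"author": "Andr\u00e9 Gide", "quote": "\u201cIt is better to be hated for what you are than to be loved for what you are not.\u201d"}
]'''

items = json.loads(sample)
author = items[0]["author"]
quote = items[0]["quote"]

print(author)
print(quote)
```

If the file itself should contain the characters unescaped, Scrapy's FEED_EXPORT_ENCODING setting (for example 'utf-8') is the place to look; I have not verified it against the Scrapy 2.1.0 run shown here.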