Sunday, 26 April 2020

How to install Scrapy on Ubuntu
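
These are the steps I used to install Scrapy 2.1.0 into a Python 3.6 virtual environment on Ubuntu 18.04: confirm the system Python, create and activate a virtualenv, install Scrapy with pip, and verify the install with the scrapy command-line tool.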

$ which python3
/usr/bin/python3

$ python3 --version
Python 3.6.9
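
This assumes virtualenv is already installed. If it isn't, either of the following should work on Ubuntu 18.04 (the second installs it for the current user only, assuming pip is available):

$ sudo apt install virtualenv
$ python3 -m pip install --user virtualenv

Next, create a virtualenv bound to the system's Python 3.6: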

$ virtualenv --python=python3.6 venv
Running virtualenv with interpreter /usr/bin/python3.6
Already using interpreter /usr/bin/python3.6
Using base prefix '/usr'
New python executable in /home/bojan/dev/github/scrapy-demo/venv/bin/python3.6
Also creating executable in /home/bojan/dev/github/scrapy-demo/venv/bin/python
Installing setuptools, pip, wheel...
done.

$ source venv/bin/activate
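
The (venv) prefix on the prompt confirms the environment is active; from here on, everything pip installs lands inside venv/ rather than system-wide. The environment can be left at any time with:

(venv) $ deactivate

With the environment active, install Scrapy: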

(venv) $ pip install scrapy
Collecting scrapy
  Downloading Scrapy-2.1.0-py2.py3-none-any.whl (239 kB)
Collecting service-identity>=16.0.0
  Downloading service_identity-18.1.0-py2.py3-none-any.whl (11 kB)
Collecting cryptography>=2.0
  Downloading cryptography-2.9.2-cp35-abi3-manylinux2010_x86_64.whl (2.7 MB)
Collecting cssselect>=0.9.1
  Downloading cssselect-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting w3lib>=1.17.0
  Downloading w3lib-1.21.0-py2.py3-none-any.whl (20 kB)
Collecting pyOpenSSL>=16.2.0
  Downloading pyOpenSSL-19.1.0-py2.py3-none-any.whl (53 kB)
Collecting zope.interface>=4.1.3
  Downloading zope.interface-5.1.0-cp36-cp36m-manylinux2010_x86_64.whl (234 kB)
Collecting protego>=0.1.15
  Downloading Protego-0.1.16.tar.gz (3.2 MB)
Collecting parsel>=1.5.0
  Downloading parsel-1.5.2-py2.py3-none-any.whl (12 kB)
Collecting queuelib>=1.4.2
  Downloading queuelib-1.5.0-py2.py3-none-any.whl (13 kB)
Collecting lxml>=3.5.0
  Downloading lxml-4.5.0-cp36-cp36m-manylinux1_x86_64.whl (5.8 MB)
Collecting PyDispatcher>=2.0.5
  Downloading PyDispatcher-2.0.5.tar.gz (34 kB)
Collecting Twisted>=17.9.0
  Downloading Twisted-20.3.0-cp36-cp36m-manylinux1_x86_64.whl (3.1 MB)
Collecting pyasn1-modules
  Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
Collecting pyasn1
  Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
Collecting attrs>=16.0.0
  Using cached attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Collecting six>=1.4.1
  Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting cffi!=1.11.3,>=1.8
  Downloading cffi-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (399 kB)
Requirement already satisfied: setuptools in ./venv/lib/python3.6/site-packages (from zope.interface>=4.1.3->scrapy) (46.1.3)
Collecting incremental>=16.10.1
  Downloading incremental-17.5.0-py2.py3-none-any.whl (16 kB)
Collecting Automat>=0.3.0
  Downloading Automat-20.2.0-py2.py3-none-any.whl (31 kB)
Collecting hyperlink>=17.1.1
  Downloading hyperlink-19.0.0-py2.py3-none-any.whl (38 kB)
Collecting PyHamcrest!=1.10.0,>=1.9.0
  Downloading PyHamcrest-2.0.2-py3-none-any.whl (52 kB)
Collecting constantly>=15.1
  Downloading constantly-15.1.0-py2.py3-none-any.whl (7.9 kB)
Collecting pycparser
  Downloading pycparser-2.20-py2.py3-none-any.whl (112 kB)
Collecting idna>=2.5
  Using cached idna-2.9-py2.py3-none-any.whl (58 kB)
Building wheels for collected packages: protego, PyDispatcher
  Building wheel for protego (setup.py) ... done
  Created wheel for protego: filename=Protego-0.1.16-py3-none-any.whl size=7765 sha256=7696d4fa63732f3509349e932b23800b7dacc6034fc150eaa3f4448fa401c6aa
  Stored in directory: /home/bojan/.cache/pip/wheels/b2/74/25/517a0ec6186297704db56664268e72686f5cfa8ab398582f33
  Building wheel for PyDispatcher (setup.py) ... done
  Created wheel for PyDispatcher: filename=PyDispatcher-2.0.5-py3-none-any.whl size=11515 sha256=17b011eb905d7eccda1eaafb51833fbe7a1b8406fb4a7764e7216ef19fafd698
  Stored in directory: /home/bojan/.cache/pip/wheels/28/db/61/691c759da06ba9b86da079bdd17cb3e01828d49d5c152cb3af
Successfully built protego PyDispatcher
Installing collected packages: six, pycparser, cffi, cryptography, pyasn1, pyasn1-modules, attrs, service-identity, cssselect, w3lib, pyOpenSSL, zope.interface, protego, lxml, parsel, queuelib, PyDispatcher, incremental, Automat, idna, hyperlink, PyHamcrest, constantly, Twisted, scrapy
Successfully installed Automat-20.2.0 PyDispatcher-2.0.5 PyHamcrest-2.0.2 Twisted-20.3.0 attrs-19.3.0 cffi-1.14.0 constantly-15.1.0 cryptography-2.9.2 cssselect-1.1.0 hyperlink-19.0.0 idna-2.9 incremental-17.5.0 lxml-4.5.0 parsel-1.5.2 protego-0.1.16 pyOpenSSL-19.1.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycparser-2.20 queuelib-1.5.0 scrapy-2.1.0 service-identity-18.1.0 six-1.14.0 w3lib-1.21.0 zope.interface-5.1.0
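
Everything above installed from prebuilt wheels (only protego and PyDispatcher were built locally, from pure-Python sdists), so no C compiler was needed. If pip ever has to compile lxml or cryptography from source, e.g. on a Python version without matching wheels, the Scrapy docs recommend installing the system build dependencies first, along these lines:

$ sudo apt install python3-dev build-essential libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev

To double-check what ended up in the virtualenv: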

(venv) $ pip list --local
Package          Version
---------------- -------
attrs            19.3.0 
Automat          20.2.0 
cffi             1.14.0 
constantly       15.1.0 
cryptography     2.9.2  
cssselect        1.1.0  
hyperlink        19.0.0 
idna             2.9    
incremental      17.5.0 
lxml             4.5.0  
parsel           1.5.2  
pip              20.0.2 
Protego          0.1.16 
pyasn1           0.4.8  
pyasn1-modules   0.2.8  
pycparser        2.20   
PyDispatcher     2.0.5  
PyHamcrest       2.0.2  
pyOpenSSL        19.1.0 
queuelib         1.5.0  
Scrapy           2.1.0  
service-identity 18.1.0 
setuptools       46.1.3 
six              1.14.0 
Twisted          20.3.0 
w3lib            1.21.0 
wheel            0.34.2 
zope.interface   5.1.0  

(venv) $ pip freeze 
attrs==19.3.0
Automat==20.2.0
cffi==1.14.0
constantly==15.1.0
cryptography==2.9.2
cssselect==1.1.0
hyperlink==19.0.0
idna==2.9
incremental==17.5.0
lxml==4.5.0
parsel==1.5.2
Protego==0.1.16
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
PyDispatcher==2.0.5
PyHamcrest==2.0.2
pyOpenSSL==19.1.0
queuelib==1.5.0
Scrapy==2.1.0
service-identity==18.1.0
six==1.14.0
Twisted==20.3.0
w3lib==1.21.0
zope.interface==5.1.0
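
pip freeze emits requirements-file syntax, so the exact versions can be pinned for reproducing this environment elsewhere:

(venv) $ pip freeze > requirements.txt

On another machine, the same set can then be restored in a fresh virtualenv with pip install -r requirements.txt.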
Verification:

(venv) $ scrapy
Scrapy 2.1.0 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

(venv) $ scrapy bench
2020-04-25 23:57:01 [scrapy.utils.log] INFO: Scrapy 2.1.0 started (bot: scrapybot)
2020-04-25 23:57:01 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.6.9 (default, Apr 18 2020, 01:56:04) - [GCC 8.4.0], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g  21 Apr 2020), cryptography 2.9.2, Platform Linux-4.15.0-96-generic-x86_64-with-Ubuntu-18.04-bionic
2020-04-25 23:57:01 [scrapy.crawler] INFO: Overridden settings:
{'CLOSESPIDER_TIMEOUT': 10, 'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO'}
2020-04-25 23:57:01 [scrapy.extensions.telnet] INFO: Telnet Password: 6642b95b9fd73c04
2020-04-25 23:57:01 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.closespider.CloseSpider',
 'scrapy.extensions.logstats.LogStats']
2020-04-25 23:57:01 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-04-25 23:57:01 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-04-25 23:57:01 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2020-04-25 23:57:01 [scrapy.core.engine] INFO: Spider opened
2020-04-25 23:57:01 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:01 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-04-25 23:57:03 [scrapy.extensions.logstats] INFO: Crawled 93 pages (at 5580 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:03 [scrapy.extensions.logstats] INFO: Crawled 157 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:04 [scrapy.extensions.logstats] INFO: Crawled 230 pages (at 4380 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:06 [scrapy.extensions.logstats] INFO: Crawled 301 pages (at 4260 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:07 [scrapy.extensions.logstats] INFO: Crawled 365 pages (at 3840 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:08 [scrapy.extensions.logstats] INFO: Crawled 422 pages (at 3420 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:09 [scrapy.extensions.logstats] INFO: Crawled 485 pages (at 3780 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:10 [scrapy.extensions.logstats] INFO: Crawled 541 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:11 [scrapy.extensions.logstats] INFO: Crawled 597 pages (at 3360 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:12 [scrapy.core.engine] INFO: Closing spider (closespider_timeout)
2020-04-25 23:57:12 [scrapy.extensions.logstats] INFO: Crawled 646 pages (at 2940 pages/min), scraped 0 items (at 0 items/min)
2020-04-25 23:57:12 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 301439,
 'downloader/request_count': 661,
 'downloader/request_method_count/GET': 661,
 'downloader/response_bytes': 2092067,
 'downloader/response_count': 661,
 'downloader/response_status_count/200': 661,
 'elapsed_time_seconds': 10.596581,
 'finish_reason': 'closespider_timeout',
 'finish_time': datetime.datetime(2020, 4, 25, 22, 57, 12, 551514),
 'log_count/INFO': 20,
 'memusage/max': 55128064,
 'memusage/startup': 55128064,
 'request_depth_max': 22,
 'response_received_count': 661,
 'scheduler/dequeued': 661,
 'scheduler/dequeued/memory': 661,
 'scheduler/enqueued': 13220,
 'scheduler/enqueued/memory': 13220,
 'start_time': datetime.datetime(2020, 4, 25, 22, 57, 1, 954933)}

2020-04-25 23:57:12 [scrapy.core.engine] INFO: Spider closed (closespider_timeout)
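
Roughly 650 pages crawled in ten seconds, so the install is working. As a last check that goes beyond the built-in benchmark, a small self-contained spider can be run without creating a project. The sketch below targets quotes.toscrape.com, a public practice site; the file name quotes_spider.py and the CSS selectors are my own choices for this example:

# quotes_spider.py
import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        # Each quote on the page sits in a <div class="quote"> element.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }
        # Follow the "Next" pagination link until there are no more pages.
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)

(venv) $ scrapy runspider quotes_spider.py -o quotes.json

runspider executes the spider file directly, without a project, and -o writes the yielded items to quotes.json as they are scraped.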
