Chapter 12: Networked Programs Flashcards

1
Q

A Python library for parsing HTML documents and extracting data from them. It compensates for most of the imperfections in real-world HTML that browsers generally ignore. You can download the code from www.crummy.com.

A

BeautifulSoup

2
Q

A number that generally indicates which application you are contacting when you make a socket connection to a server. As an example, web traffic usually uses ____ 80 while email traffic uses ____ 25.

A

port
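As a small illustration of where the port shows up in practice, the sketch below reads the port out of a URL and falls back to a default when the URL omits it. The DEFAULT_PORTS table and the port_for helper are hypothetical names introduced for this example; 80 and 443 are the conventional defaults for HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: conventional default port for each URL scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}

def port_for(url):
    """Return the port named in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)
```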

3
Q

When a program pretends to be a web browser, retrieves a web page, and then examines the page's content. Such programs often follow the links in one page to find the next, so they can traverse a network of pages or a social network.

A

scrape
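The "looks at the content" step can be sketched as pulling the links out of a page. This is a simplified illustration that uses a regular expression on a hypothetical HTML snippet (sample_html and extract_links are names made up for the example); a real scraper would usually download the page first and may prefer an HTML parser, since regular expressions miss many valid ways of writing a link.

```python
import re

def extract_links(html):
    """Return every absolute http(s) link found in href="..." attributes."""
    return re.findall(r'href="(https?://[^"]*)"', html)

# Hypothetical page content standing in for a downloaded document.
sample_html = '''
<p>See <a href="http://example.com/page2.html">page 2</a>
and <a href="https://example.org/">another site</a>.</p>
'''
links = extract_links(sample_html)
```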

4
Q

A network connection between two applications where the applications can send and receive data in either direction.
The Python module of the same name must be imported before use.

A

socket

import socket
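To make "either direction" concrete, here is a minimal sketch that runs a tiny server and a client in the same program over the loopback address, so no outside network is needed. The echo_upper_once helper and the choice of port 0 (which asks the operating system for any free port) are assumptions made for the demo.

```python
import socket
import threading

def echo_upper_once(server_sock):
    """Accept one connection, read one message, reply with it upper-cased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(512)
        conn.sendall(data.upper())

# Listen on the loopback interface; port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=echo_upper_once, args=(server,))
worker.start()

# The client side: connect, send bytes, and read the reply on the same socket.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(512)
client.close()

worker.join()
server.close()
```

The same client socket both sends the request and receives the reply, which is the two-way behavior the card describes.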

5
Q

The act of a web search engine retrieving a page, then all the pages linked from that page, and so on, until it has nearly all of the pages on the Internet, which it uses to build its search index.

A

spider
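The "page, then linked pages, and so on" pattern is a graph traversal. The sketch below runs it over a hypothetical in-memory set of pages (FAKE_WEB) instead of the real Internet; in a real spider the get_links argument would download each page and extract its links.

```python
from collections import deque

# Hypothetical in-memory "web": each page name maps to the pages it links to.
FAKE_WEB = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["story1", "story2"],
    "story1": [],
    "story2": ["home"],
}

def spider(start, get_links):
    """Visit start, then every page reachable from it, each page only once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

visited = spider("home", lambda page: FAKE_WEB.get(page, []))
```

Tracking the pages already seen is what keeps the spider from looping forever when pages link back to each other.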

6
Q

A set of precise, predetermined rules, implemented in hardware and software, that determines how data is transmitted between devices on a network. It breaks large-scale processes down into smaller functions so that devices can communicate.

A

protocol

7
Q

Application-level protocol that defines how data is transmitted over the web and how web servers and browsers should respond to commands.

When using Python sockets, the data must be sent as bytes objects, not strings.

A

HTTP(S)
hypertext transfer protocol (secure)

8
Q

character sequence that marks EOL (end of line) in an HTTP request
character sequence that produces a blank line (an EOL followed by an empty line)

A

\r\n (EOL)
\r\n\r\n (blank line)
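A rough sketch of how these sequences fit together when building a request by hand. The build_get_request helper, the host, and the path are hypothetical; the point is only that every line ends in \r\n and that one extra \r\n supplies the blank line that ends the headers.

```python
def build_get_request(host, path):
    """Build a minimal HTTP/1.0 GET request as a bytes object."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
    ]
    # Each line ends with \r\n; one extra \r\n adds the blank line after the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode()

request = build_get_request("example.com", "/index.html")
```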

9
Q

technique to receive data from a socket in chunks of up to 512 bytes and print the data until there is no more to read (that is, until recv() returns an empty bytes object)

A

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(), end='')
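The loop above assumes a connected socket named mysock. As a self-contained sketch of the same pattern, the version below collects the chunks instead of printing them and uses socket.socketpair() to get two already-connected sockets, so it runs without a network. The read_all helper and the 1300-byte payload are assumptions chosen for the demo.

```python
import socket

def read_all(sock, chunk_size=512):
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if len(data) < 1:   # empty bytes object: the peer has closed
            break
        chunks.append(data)
    return b"".join(chunks)

# socketpair() gives two already-connected sockets, handy for an offline demo.
sender, receiver = socket.socketpair()
sender.sendall(b"x" * 1300)   # more than two 512-byte chunks
sender.close()                # closing signals end of data to the reader
received = read_all(receiver)
receiver.close()
```

Note that recv(512) returns at most 512 bytes; it may return fewer, which is why the loop keeps going until it sees an empty result.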

10
Q

string method to convert strings into bytes objects (needed before sending text over a socket)

A

.encode()

11
Q

bytes method to convert bytes objects to strings (needed after receiving data from a socket)

A

.decode()
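The conversion in each direction can be sketched as a round trip. The sample word is an arbitrary choice, picked because it contains a non-ASCII character and so shows that the number of characters and the number of bytes can differ.

```python
text = "café"
data = text.encode()        # str -> bytes, UTF-8 by default
round_trip = data.decode()  # bytes -> str, UTF-8 by default
```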

12
Q

literal notation that creates a bytes object directly, without calling .encode()

A

b' '
e.g. b'Hello World'
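A quick sketch comparing the literal with the method call, for ASCII text where the two give the same result:

```python
literal = b"Hello World"            # bytes literal, written directly in the source
encoded = "Hello World".encode()    # the same bytes, produced from a str

first_byte = literal[0]             # indexing a bytes object gives an integer
```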

13
Q

function that pauses the program for a set amount of time before it asks for more data, in order to let the server catch up

A

time.sleep(0.25)
= wait 0.25 seconds between calls

import time
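A minimal sketch of pacing a loop with time.sleep. The paced helper and the shortened 0.05-second delay are assumptions made so the demo finishes quickly; the card's value of 0.25 is the helper's default.

```python
import time

def paced(items, delay=0.25):
    """Yield items one at a time, sleeping `delay` seconds between them."""
    for index, item in enumerate(items):
        if index > 0:
            time.sleep(delay)
        yield item

start = time.monotonic()
chunks = list(paced([b"a", b"b", b"c"], delay=0.05))
elapsed = time.monotonic() - start
```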

14
Q

The pausing of either the sending application or the receiving application when one side produces or consumes data faster than the other

A

flow control

15
Q

Python library that treats a web page much like a file

A

urllib

import urllib.request

16
Q

code to open a web page with urllib and print it line by line
(each line is a bytes object, so it needs the decode method)

A

fhand = urllib.request.urlopen(filename)
for line in fhand:
    print(line.decode().strip())
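For a version that can be run without network access, the sketch below points urlopen at a temporary local file through a file:// URL. That substitution is an assumption made for the demo (a real program would pass an http or https address), but the loop over the returned object is the same.

```python
import tempfile
import urllib.request
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Assumption: a local file reached through a file:// URL stands in for a web page.
    page = Path(folder) / "page.txt"
    page.write_bytes(b"first line\nsecond line\n")

    lines = []
    with urllib.request.urlopen(page.as_uri()) as fhand:
        for line in fhand:                        # each line arrives as bytes
            lines.append(line.decode().strip())
```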

17
Q

urllib module that defines functions and classes which help in opening URLs

A

urllib.request

18
Q

urllib module that processes URL into components

A

urllib.parse
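A short sketch of splitting a URL into its components. The URL itself is a made-up example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL used only to show the pieces.
parts = urlparse("https://www.example.com:8443/search/results.html?q=python&page=2#top")
query = parse_qs(parts.query)   # query string as a dict of lists
```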

19
Q

urllib module that defines the exception classes for exceptions raised by urllib.request.

A

urllib.error
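One way these exception classes are typically used is sketched below. The fetch helper is a hypothetical name, and the demo triggers the error with a file:// URL that points at a missing file so that it runs offline. HTTPError is caught first because it is a subclass of URLError.

```python
import tempfile
import urllib.error
import urllib.request
from pathlib import Path

def fetch(url):
    """Return the body as bytes, or None if urllib reports an error."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:   # server answered with an error status
        print("HTTP error:", err.code)
    except urllib.error.URLError as err:    # could not reach or open the resource
        print("URL error:", err.reason)
    return None

with tempfile.TemporaryDirectory() as folder:
    # Assumption: a file:// URL to a missing file raises URLError without network access.
    missing = (Path(folder) / "missing.html").as_uri()
    result = fetch(missing)
```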

20
Q

file mode that opens a binary file for writing only

A

'wb'
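A small sketch of the mode in use, for example when saving downloaded bytes such as an image. The payload and the file name are arbitrary placeholders.

```python
import tempfile
from pathlib import Path

payload = bytes([0x89, 0x50, 0x4E, 0x47])  # arbitrary example bytes

with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "download.bin"   # hypothetical file name
    with open(target, "wb") as fhand:        # "wb": write-only, binary
        fhand.write(payload)
    with open(target, "rb") as fhand:        # "rb": read-only, binary
        saved = fhand.read()
```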

21
Q

module that lets a program configure secure (TLS) connections, needed to access websites that strictly enforce HTTPS

A

ssl

import ssl
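A hedged sketch of two ways a context might be set up. The default context verifies certificates; the relaxed one skips verification, a workaround sometimes used in exercises when a site's certificate cannot be validated, and it is not safe for real traffic. Either context can be handed to urllib.request.urlopen through its context argument.

```python
import ssl

# Default context: certificates are verified and host names are checked.
strict = ssl.create_default_context()

# Relaxed context that skips verification (exercise-only workaround).
relaxed = ssl.create_default_context()
relaxed.check_hostname = False       # must be turned off before changing verify_mode
relaxed.verify_mode = ssl.CERT_NONE
```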

22
Q

method of the HTTPResponse object returned by urlopen() that returns the entire HTML source as a single bytes object

A

.read()
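A minimal sketch of reading a whole document in one call. As an assumption for offline use, a temporary local file reached through a file:// URL stands in for the web page; with an http or https URL the call to read() is the same.

```python
import tempfile
import urllib.request
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Assumption: a local file stands in for a page on a web server.
    page = Path(folder) / "page.html"
    page.write_bytes(b"<h1>Hello</h1>\n")
    with urllib.request.urlopen(page.as_uri()) as response:
        body = response.read()     # the whole document as one bytes object
    text = body.decode()           # convert to str before treating it as text
```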

23
Q

Unix/Linux commands used to retrieve web pages and remote files

A

curl
wget