Lecture 4 Flashcards

1
Q

Mention some good things about IDEs:

A

Syntax highlighting, code completion, execution, debugging

2
Q

What is a version control system (VCS)?

A

It enables concurrent development by multiple people and keeps track of revisions.

3
Q

What are some other (bad) alternatives to a version control system?

A

Local copy on disk, zip file via email, and Dropbox. Downsides: concurrent edits are not possible, and management is complicated.

4
Q

What does development with multiple people look like?

A

Multiple developers, software architects, testers, and the build server all work together on the same source code.

5
Q

Again, what do we say software development is?

A

Iterative and incremental, because it involves building software in small, manageable parts.

6
Q

What terms are used when defining a version control system?

A

Version, branch (temporary), merge (development line), tag

7
Q

Why do we need version control, when we say software development is incremental?

A

Because incremental development creates multiple versions of the software system. Some of those versions need to be maintained, e.g., when they were released to customers who do not necessarily update to the newest version immediately but still expect updates to the version they have installed.

8
Q

What is a merge conflict?

A

A conflict that occurs when attempting to merge changes that cannot be integrated automatically (e.g., one development line deleted code that the other line changed).
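
As a rough illustration of why some changes merge automatically and others conflict, here is a minimal sketch of a three-way merge for a single line of text. The class and method names are hypothetical, and real version control systems work on whole files with far more elaborate rules; this only shows the basic decision.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not how any real VCS is implemented.
class ThreeWayMerge {
    // base:   the line as it was before the two development lines diverged
    // ours:   the line on the current development line
    // theirs: the line on the development line being merged in
    // Returns the merged line, or Optional.empty() if there is a conflict.
    static Optional<String> merge(String base, String ours, String theirs) {
        if (ours.equals(theirs)) return Optional.of(ours);   // both sides agree
        if (ours.equals(base)) return Optional.of(theirs);   // only "theirs" changed
        if (theirs.equals(base)) return Optional.of(ours);   // only "ours" changed
        return Optional.empty();                             // both changed differently
    }

    public static void main(String[] args) {
        // Only one side changed the line: merged automatically.
        System.out.println(merge("int x = 1;", "int x = 2;", "int x = 1;")); // Optional[int x = 2;]
        // One side changed the line, the other deleted it (empty string): conflict.
        System.out.println(merge("int x = 1;", "int x = 2;", ""));           // Optional.empty
    }
}
```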

9
Q

What do version control systems (usually) combine?

A

The management of multiple versions with an online code repository

10
Q

What is SVN?

A

SVN (Subversion) is a client-server version control system. Development happens in a local working copy of the source code. When a meaningful set of changes has been made, it can be sent to the remote repository.

11
Q

Give a step-by-step description of the SVN workflow.

A

Import: Sends the source code from the local computer to the remote repository (done only once per directory, to initialise the repository).

Check out: Initialises the local working copy with data from the repository. This is done once per directory from the repository, when copying it to the local computer.

Commit: Sends changed data to the repository, where it is integrated into the current state of the repository.

Update: Retrieves changed data from the repository and integrates it into the current state of the working copy.

12
Q

Mention the most relevant VCS, and what kind they are

A

Subversion (SVN), Client-server version control
Git, Distributed version control

13
Q

What are some SVN clients?

A

Command line: Apache SVN, SlikSVN

IDE (via plug-in):
* Eclipse
* * Subversive
* * Subclipse
* IntelliJ IDEA and NetBeans
* * integrated

External tool: TortoiseSVN, SmartSVN

14
Q

What are characteristics of SVN (Subversion)?

A
  • Can check out parts of projects (e.g., directories) (not at all how SVN is intended)
  • Renames/moves count as new files (lost history)
  • Mediocre support for nonlinear workflows
15
Q

What are characteristics of Git?

A
  • Distributed
  • Can check out only entire projects (good)
  • Recognizes moves/renames (history maintained)
  • Very good support for nonlinear workflows
16
Q

How does Git work?

A

Git is a distributed version control system in which every developer has a local repository, so committing doesn’t need a network connection. Commits are not visible to other people collaborating remotely until they are pushed. This has the benefit that it fosters frequent commits, even if these are only meaningful to the current developer (and not the entire team).

17
Q

Write the five most common commands for SVN:

A
  • Import: svn import C:\repository LINK
  • Checkout: svn checkout LINK
  • Add: svn add file.java
  • Commit: svn commit -m "comment"
  • Update: svn update
18
Q

Give a step-by-step description of the Git workflow.

A

Init: Creates a local repository for the directory. This is done only once, to initialise the repository. The data is not available in the remote repository until it has been committed and pushed.
Clone: Initialises the local repository and the local working copy with the data from the remote repository. This is done only once per remote repository, when copying it to the local computer.
Commit: Sends changed data to the local repository and integrates it into the current state of that repository. The data is still not available in the remote repository, as it hasn’t been pushed yet.
Push: Sends the data from the local repository to the remote repository.
Pull: Retrieves the data from the remote repository into the local repository and integrates it into the current state of the working copy.
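
The difference between committing and pushing can be sketched with a toy model in Java. This is not Git itself: a repository is reduced to a list of commit messages, and the names ToyRepo, ToyGitDemo, alice and bob are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not Git): a repository is just an ordered list of commit messages.
class ToyRepo {
    final List<String> commits = new ArrayList<>();
}

class ToyGitDemo {
    // commit: record a change in the LOCAL repository only.
    static void commit(ToyRepo local, String message) {
        local.commits.add(message);
    }

    // push: copy commits that the remote does not have yet from local to remote.
    static void push(ToyRepo local, ToyRepo remote) {
        for (String c : local.commits) {
            if (!remote.commits.contains(c)) {
                remote.commits.add(c);
            }
        }
    }

    // pull: copy commits that the local repository does not have yet from remote to local.
    static void pull(ToyRepo local, ToyRepo remote) {
        push(remote, local);
    }

    public static void main(String[] args) {
        ToyRepo remote = new ToyRepo();
        ToyRepo alice = new ToyRepo();
        ToyRepo bob = new ToyRepo();

        commit(alice, "Add Example.java");
        System.out.println(remote.commits.size()); // 0: committed locally, but not pushed yet

        push(alice, remote);
        pull(bob, remote);
        System.out.println(bob.commits);           // [Add Example.java]
    }
}
```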

19
Q

Mention some online SVN hosting services:

A

Rioux SVN, Assembla, Project Locker

20
Q

What is Git branching and merging?

A

In Git, you normally start a new branch to develop a particular piece of functionality, then merge those changes back into the main development line (main branch).
checkout switches to another branch of development. When used with -b, a new branch is created before immediately switching to it.
merge merges the changes of the specified branch into the currently active branch. To merge into the main branch, you have to switch to main first.

21
Q

Mention the most common commands for Git:

A
  • init: git init
  • clone: git clone LINK
  • add: git add Example.java
  • commit: git commit -m "comment"
  • push: git push
  • pull: git pull
22
Q

What are some popular VCS?

A

CVS (Concurrent Versions System):
* Client-server, 1980s
* First popular VCS

SVN:
* Client-server, early 2000s
* Successor to CVS, standard of the 2000s and still in use today

BitKeeper:
* Distributed, early 2000s
* Used for Linux kernel early 2000s

Git:
* Distributed, mid 2000s
* Standard today and used for Linux kernel today

Mercurial:
* Distributed, mid 2000s
* Adopted by Facebook in 2013

23
Q

What is a branch? Why are branches useful?

A

Each branch represents an independent line of development. Good for parallel development and feature integration.

24
Q

What is the semantic versioning scheme for releases?

A

It’s a scheme for numbering software releases in which each part of the version number signals what kind of change was made. Incrementing the major version signals that there might be disruptive changes.

25
Q

Mention some Git online platforms:

A

GitHub, GitLab, Bitbucket

26
Q

Give an example of the versioning scheme for releases?

A

7.15.2891 (major, minor, patch)
Major: Significant new program functionality; results in interfaces that may no longer be compatible.
Minor: New program functionality that is compatible with old functionality; results in interfaces that are possibly extended but still backward compatible.
Patch: Bug fixes and minor internal changes; results in unchanged interfaces.
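
A minimal sketch of how the three parts could be read and compared in Java. The class name ReleaseVersion and the method changeKind are hypothetical and exist only for this illustration.

```java
// Hypothetical helper class; the names are assumptions made for this sketch.
class ReleaseVersion {
    final int major;
    final int minor;
    final int patch;

    ReleaseVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // Parses text of the form "major.minor.patch", e.g. "7.15.2891".
    static ReleaseVersion parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.patch but got: " + text);
        }
        return new ReleaseVersion(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    // Names the most significant part that differs between two versions.
    static String changeKind(ReleaseVersion from, ReleaseVersion to) {
        if (from.major != to.major) return "major";
        if (from.minor != to.minor) return "minor";
        if (from.patch != to.patch) return "patch";
        return "none";
    }

    public static void main(String[] args) {
        ReleaseVersion installed = parse("7.15.2891");
        System.out.println(changeKind(installed, parse("7.15.2892"))); // patch
        System.out.println(changeKind(installed, parse("7.16.0")));    // minor
        System.out.println(changeKind(installed, parse("8.0.0")));     // major
    }
}
```

A "major" result is the case in which interfaces may no longer be compatible; "minor" and "patch" leave existing interfaces usable.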

27
Q

What are “releases” in the context of versions?

A

Releases are versions that are provided to customers/the public for use.

28
Q

What are some Big-Data tools?

A
  • Apache HBase
  • Phoenix
  • Druid
  • Hadoop
  • Spark
  • Cassandra
  • Apache Hive
  • Apache Drill
  • Apache ZooKeeper
  • Apache Pig
  • Apache Kafka
  • Samza
  • Mahout
  • ml4j
  • DL4J
  • Weka

29
Q

How many releases does a company usually have? Explain those.

A

Three:
* Long-term support (for late adopters that cannot update frequently, e.g., large companies; this release still receives critical updates, but no new functionality)
* Stable (current version; the majority of people use it)
* Pre-release (alpha/beta; contains the newest features, but has not been tested properly)

31
Q

What is Hadoop?

A

A framework whose primary use is parallel data processing from disk (HDFS).

32
Q

What are the base modules of Hadoop?

A

Common, HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), MapReduce

33
Q

What is Spark?

A

A framework whose primary use is parallel data processing in RAM.

34
Q

Name some data storage tools, and their characteristics.

A

Apache HBase: Distributed, fault tolerant, column-oriented non-relational database on top of HDFS
Apache Phoenix: Distributed relational database engine with SQL support using HBase
Druid: Distributed column-oriented data store for real-time analytics
Cassandra: Distributed wide-column data store for big data (column names can vary per row, NoSQL database for big data)

35
Q

Compare Hadoop and Spark

A
  • Hadoop: From disk (HDFS), HiveQL, Fast, medium cost
  • Spark: In RAM, SparkSQL, Very Fast, high cost
36
Q

What is HDFS (Hadoop Distributed File System)?

A

It’s a storage system that distributes data across multiple “regular” computers. Such machines are easy to obtain in large numbers and can be in different locations. As a result, the overall bandwidth for accessing the stored data may be significantly higher than with a single supercomputer at one site.

37
Q

Mention some Data Query Tools

A

Apache Hive, Apache Drill

38
Q

What is MapReduce?

A

It’s a programming model for large-scale parallel data processing. It is a specialisation of the split-apply-combine strategy.
Map: filters and sorts data (e.g., sorting records into one queue per city)
Reduce: performs a summary operation (e.g., averaging the values for each city)
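
The city example can be sketched on a single machine with plain Java streams. This is not the Hadoop MapReduce API; it only mirrors the idea of grouping records by a key and then summarising each group. The Reading class and the sample cities are made up for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Single-machine sketch of the map/reduce idea; names and data are hypothetical.
class CityAverage {
    // One measurement for one city.
    static final class Reading {
        final String city;
        final double value;

        Reading(String city, double value) {
            this.city = city;
            this.value = value;
        }
    }

    static Map<String, Double> averageByCity(List<Reading> readings) {
        return readings.stream().collect(Collectors.groupingBy(
                r -> r.city,                                // "map" side: key each reading by its city
                TreeMap::new,                               // keep the cities sorted by name
                Collectors.averagingDouble(r -> r.value))); // "reduce" side: summarise each group
    }

    public static void main(String[] args) {
        List<Reading> readings = List.of(
                new Reading("Oslo", 2.0),
                new Reading("Bergen", 10.0),
                new Reading("Oslo", 4.0));
        System.out.println(averageByCity(readings)); // {Bergen=10.0, Oslo=3.0}
    }
}
```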

39
Q

Mention some Calculation Tools

A

Apache Pig: High-level platform for creating programs that run on Hadoop
Apache Kafka: Collect and distribute data streams in real-time from/to interested clients
Apache Samza: Develop applications that process streaming data, e.g., from Kafka

40
Q

Name some coordination tools, and their characteristics.

A

Apache ZooKeeper: Centralized service for distributed access to a hierarchical key-value store

42
Q

Difference between Git and SVN?

A

SVN is a client-server version control system, where a single central repository is used. Git, on the other hand, is a distributed version control system, where each user has a complete local copy of the repository.

43
Q

Name some machine learning tools, and their characteristics.

A

Apache Mahout: Collection of distributed, scalable machine learning algorithms
ml4j: Machine learning library
DL4J: Distributed deep learning library
WEKA: Data mining through machine learning

44
Q

What does parseInt do?

A

Converts String to int.
(“123” works, but “abc” and 123 don’t: “abc” is not a number, and 123 is an int rather than a String, so the call does not compile.)
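
A small sketch of this behaviour using Integer.parseInt from the Java standard library. The helper tryParse is a hypothetical convenience method added for this example.

```java
import java.util.OptionalInt;

class ParseIntDemo {
    // Hypothetical helper: returns the parsed number, or an empty value if the text is not an int.
    static OptionalInt tryParse(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("123") + 1);   // 124
        System.out.println(tryParse("abc").isPresent());   // false: parseInt threw NumberFormatException
        // Integer.parseInt(123);  // does not compile: the argument must be a String
    }
}
```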

45
Q

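What are runtime errors and syntax errors?

A

Runtime error: the code compiles, but an operation fails while the program is running (e.g., parsing “abc” as a number or dividing by zero).
Syntax error: the code is written in a way that violates the rules of the language (e.g., a missing semicolon), so it does not compile.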
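
A minimal Java sketch of the difference. The class and method names are hypothetical; the syntax error is shown only as a comment, because a file containing it would not compile.

```java
class ErrorKinds {
    // Syntax error (rejected by the compiler), for example a missing semicolon:
    //     int x = 5
    // The program cannot be run until this is fixed.

    // This method compiles fine, but fails at runtime when b is 0.
    static int divide(int a, int b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(divide(6, 3)); // 2
        try {
            divide(1, 0);
        } catch (ArithmeticException e) {
            // Runtime error: only detected while the program is running.
            System.out.println("runtime error: " + e.getMessage());
        }
    }
}
```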

46
Q

Explain these (in the context of the this keyword):
A. Refer to an instance variable
B. Call a local constructor
C. Pass the object as an argument

A

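A: this.name refers to the instance variable of the current object, e.g., when a parameter or local variable in the method has the same name.
B: this(...) calls another constructor of the same class whose parameters match the given arguments.
C: We pass “this” (the current object) as an argument to a method.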
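
The three uses can be seen together in one short Java sketch. The classes Point and Printer are hypothetical examples, not taken from the lecture.

```java
// Hypothetical classes illustrating the three uses of "this".
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x; // A: this.x is the instance variable, x is the parameter
        this.y = y;
    }

    Point() {
        this(0, 0); // B: call another constructor of the same class
    }

    int getX() { return x; }
    int getY() { return y; }

    String describe() {
        return Printer.format(this); // C: pass the current object as an argument
    }
}

class Printer {
    static String format(Point p) {
        return "Point(" + p.getX() + ", " + p.getY() + ")";
    }

    public static void main(String[] args) {
        System.out.println(new Point().describe());     // Point(0, 0)
        System.out.println(new Point(3, 4).describe()); // Point(3, 4)
    }
}
```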