DAY1 RECON & OSINT Flashcards
(43 cards)
WHAT IS RECONNAISSANCE
“Recon is the science of gathering information about a target”
T/F Profile a target (a user, a company or any victim) in depth
T
T/F Reconnaissance relies heavily on OSINT
T
Relies heavily on OSINT
Open-Source Intelligence (publicly available information that can prove to be helpful to the attacker)
• Collect and analyze everything that’s “out there” pertaining to the target
» Direct communication with the target mostly does not happen
T/F Reconnaissance is used in various domains, including cybersecurity, law enforcement, business intelligence, and national security
T
PHASES OF HACKING
RECONNAISSANCE STAGE:
-FOOTPRINTING
-SCANNING
-ENUMERATION
SYSTEM HACKING:
-GAINING ACCESS
-MAINTAINING ACCESS
-COVERING TRACKS & BACKDOORING
TYPES OF RECON
› Passive
» Relies heavily on OSINT techniques
» Does not reveal the source of the activity (anonymity)
» Information can be inaccurate or out-of-date
› Active
» Interact with the system directly (tools communicate with the target)
• Direct victim profiling via scanning and enumeration-based invasive techniques
» Information is accurate and up-to-date
» Can reveal the source of the activity (identity is compromised)
TYPES OF RECON
WE WILL SEE BOTH ACTIVE & PASSIVE RECON (OSINT) TOOLS & TECHNIQUES
› Valuable information on web pages
» HTML comments
• Sensitive information left there by developers
» Website Mirroring & Web Spiders/Crawlers
• Technique used to copy publicly available and linked content for offline analysis
» Directory Brute Forcing and Forced Browsing
• Technique used to discover hidden, restricted and unlinked content on the web server
» Google Hacking via Dorks & Advanced Search
• Advanced search queries that return very specific data from websites
» Email Harvesting
• Gathering email addresses of individual victims or potential targets in an organization
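As an illustration of the "HTML comments" item above, here is a minimal sketch that pulls developer comments out of a page using Python's standard `html.parser` module. The `CommentCollector` class, the `extract_comments` helper, and the sample page are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # HTMLParser calls this once for each <!-- ... --> block
        self.comments.append(data.strip())


def extract_comments(html):
    """Return the stripped text of all comments found in `html`."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments


# Hypothetical page source, for illustration only
sample_page = """
<html>
  <body>
    <!-- TODO: remove test account before launch -->
    <h1>Welcome</h1>
    <!-- staging server: staging.example.com -->
  </body>
</html>
"""

for comment in extract_comments(sample_page):
    print(comment)
```

Run against a mirrored copy of a site, the same function could be applied to every saved page without touching the live server again.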
WEBSITE MIRRORING: HOW DOES IT WORK?
› Site Ripping
» Download the entire website to your local machine for offline browsing
• Retrieves the content that is publicly available or linked on the site
• Does not look for hidden directories or content (no brute force or dictionaries are used)
• Tools follow/explore links/references on the main page and then sub-pages
» Much easier to parse and analyze the website for useful information
• No further interaction with the live site (no need to send repeated requests for content)
› Many tools exist
» PageNest
» BlackWidow
» HTTrack
› Web Spider or Crawler
» Visit website homepage and follow/open links to sub-pages recursively
• Content needs to be publicly accessible for this
» Downloads relevant content in an automated fashion matching pre-defined search criteria
• Regular expressions
• Certain file extensions (like JPG or PNG)
• Metadata of Office (Word, PowerPoint, Excel) & PDF documents
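The crawl-and-match behaviour described above can be sketched as follows. To keep the example self-contained and offline, the `fetch` argument is a stand-in for a real HTTP client, and `FAKE_SITE` is a hypothetical in-memory site; a real spider would also need rate limiting and error handling, which are omitted here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href/src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch, wanted_extensions):
    """Breadth-first crawl; returns (visited pages, files matching the criteria)."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited, matches = [], []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:          # page not available
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            # Skip repeats and anything hosted outside the target site
            if absolute in seen or urlparse(absolute).netloc != host:
                continue
            seen.add(absolute)
            if absolute.lower().endswith(wanted_extensions):
                matches.append(absolute)   # meets the search criteria
            else:
                queue.append(absolute)     # treat as a sub-page and follow it
    return visited, matches


# Hypothetical in-memory "site" standing in for real HTTP responses
FAKE_SITE = {
    "http://example.com/": '<a href="/about.html">About</a> <img src="/logo.png">',
    "http://example.com/about.html": '<a href="/">Home</a> <a href="/docs/report.pdf">Report</a>',
}

pages, files = crawl("http://example.com/", FAKE_SITE.get, (".png", ".pdf"))
print(pages)
print(files)
```

Only linked content is ever reached, which matches the point made above: a spider finds what the site references, not what it hides.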
DIRECTORY BRUTE FORCING: WHAT IS IT?
› Systematically trying different directory and file names to see if they exist on the server.
» Used for accessing hidden, restricted or unlinked content on a website
» Often a contextually relevant list of common directory and file names (dictionary) is used
» For instance, for a university web server, the potential entries in the dictionary could be:
• Academics
• Grades
• Registrar
• Student Affairs
• Courses
» Another option to discover content is to attempt all possible combinations (brute forcing)
• A → Z
• 0 → 9
• Combinations of letters and digits to cover the entire space of possible directory and file names
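The two ways of building candidate names can be compared with a short sketch. It only generates candidates and sends no requests. The base URL is hypothetical, and turning "Student Affairs" into `student-affairs` is an assumed normalization, not something the source specifies.

```python
import itertools
import string


def dictionary_candidates(base_url, wordlist):
    """Build one candidate directory URL per dictionary entry."""
    base = base_url.rstrip("/")
    # Assumption: lowercase names with spaces replaced by hyphens
    return [f"{base}/{word.lower().replace(' ', '-')}/" for word in wordlist]


def brute_force_names(alphabet, max_length):
    """Yield every name over `alphabet` with length 1..max_length."""
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)


university_words = ["Academics", "Grades", "Registrar", "Student Affairs", "Courses"]
for url in dictionary_candidates("http://university.example", university_words):
    print(url)

alphabet = string.ascii_lowercase + string.digits   # 26 letters + 10 digits = 36 symbols
names = list(brute_force_names(alphabet, 2))
print(len(names))   # 36 + 36**2 = 1332
```

Five dictionary entries give five requests, while exhaustive names of length up to two already give 1332, and each extra character multiplies the count by 36. This is why contextual dictionaries are usually tried first.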
DIRECTORY BRUTE FORCING: MORE TOOLS
Lots of Web Content Scanners
» BurpSmartBuster (plug-in for Burp Suite)
» Dirsearch
» DIRB (available in Kali with built-in dictionaries)
» Cansina (available with BlackArch Linux) – Good one!
» Meg (does not overwhelm the servers)
» Wfuzz (available in Kali with much more functionality)
» Gobuster
FORCED BROWSING: WHAT IS IT?
› Directory brute forcing is a resource-intensive activity (aggressive)
» May trigger security alerts on the target server
› Instead, strategically manipulate URLs to take advantage of vulnerabilities in the application’s input validation or authorization mechanisms
» Attackers attempt to navigate to directories or resources that should be protected but are not due to flawed security configurations (improper access control)
» Targeted approach which is more stealthy
» Feroxbuster is useful for forced browsing
• Uses brute forcing as well as wordlists (dictionaries)
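One simple form of URL manipulation is walking up the directory tree from a resource that is already known. The sketch below derives those parent-directory URLs without sending any request; the function name and the sample URL are hypothetical, and this is only one of many possible manipulations.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_directories(url):
    """Return the URL of every parent directory of `url`, nearest first."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop one trailing segment at a time: /a/b/c.pdf -> /a/b/ -> /a/ -> /
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in parent_directories("http://example.com/files/2024/report.pdf"):
    print(candidate)
```

A directory that returns a listing instead of an access-denied response would be an instance of the improper access control described above.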
GOOGLE DORKS: SEARCH ENGINES
› Advanced Google queries and operators
» cache: Display results from pages stored in Google cache
» link: Display results with links to the specified page
» related: Display similar results
» site: Display results from the queried website only
» intitle: Display results that have searched keywords in title
» inurl: Display results that have searched keywords in the URL
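Operators can be combined in a single query. The helper below is a hypothetical convenience for assembling such query strings; it only builds text and does not contact any search engine. Quoting multi-word values is an assumption about how one would normally type them.

```python
def build_dork(keywords="", **operators):
    """Join operator:value pairs and free-text keywords into one query string."""
    parts = []
    for name, value in operators.items():
        # Quote values containing spaces so they stay attached to the operator
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


print(build_dork(site="example.com", intitle="index of"))
print(build_dork("admin", site="example.com", inurl="login"))
```

The first query restricts results to one site and to pages whose title contains "index of"; the second looks for login URLs on the same site that mention "admin".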
EMAIL HARVESTING: HOW DOES IT WORK?
› Email Harvest
» Gathering emails of potential victims
» Step 1: Guess email IDs, because companies follow a naming pattern
• ali.hassan.1@kaust.edu.sa (first name, a dot, the last name, then a dot and a number)
• Do this for as many users as possible (dictionaries of common names, employee lists, brute force, etc.)
» Step 2: Send an email to the guessed email ID
» Step 3: Analyze the response of the SMTP server
• If email is accepted, add to the database of harvested email IDs
• If email is rejected, discard it (Delivery Status Notification msg)
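Steps 1 and 3 can be sketched without sending any mail. In the code below, the pattern placeholders `{first}`, `{last}` and `{f}` (first initial), the function names, and the `example.com` domain are all hypothetical. The reply-code ranges follow standard SMTP conventions (2xx success, 4xx temporary failure, 5xx permanent failure).

```python
def candidate_emails(names, domain, pattern="{first}.{last}"):
    """Step 1: apply one naming pattern to a list of (first, last) name pairs."""
    emails = []
    for first, last in names:
        local = pattern.format(
            first=first.lower(), last=last.lower(), f=first[0].lower()
        )
        emails.append(f"{local}@{domain}")
    return emails


def classify_reply(smtp_code):
    """Step 3: map an SMTP reply code to the action to take."""
    if 200 <= smtp_code < 300:
        return "keep"       # e.g. 250: recipient accepted
    if 500 <= smtp_code < 600:
        return "discard"    # e.g. 550: mailbox unavailable
    return "retry"          # 4xx: temporary failure, result unknown


staff = [("Ali", "Hassan"), ("Sara", "Khan")]
print(candidate_emails(staff, "example.com"))
print(candidate_emails(staff, "example.com", pattern="{f}.{last}"))
print(classify_reply(250), classify_reply(550), classify_reply(451))
```

An accepted address is only a hint: some servers accept every recipient and bounce later, so a "keep" result may still be a false positive.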
EMAIL HARVESTING: OTHER OPTIONS
› Spider or Crawler Scans
» Use web crawlers and spiders to go search through the entire website, forums, blogs, etc., for email addresses
› Search Engines
» Use Google and other search engines to return all email addresses having a certain suffix, such as “@kaust.edu.sa”
› Email Address Lookup Services
» Hunter.io - https://hunter.io/
» Phonebook.cz - https://phonebook.cz/
» VoilaNorbert - https://www.voilanorbert.com/
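Whichever source the text comes from (crawled pages, search results, forum posts), pulling out addresses with a given suffix is a pattern-matching task. The regular expression below is a deliberately simple approximation of an email address, not a full validator, and the sample text and `@example.edu` suffix are made up for illustration.

```python
import re

# Simplified pattern: local part, "@", domain labels, and a top-level domain
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_emails(text, suffix=None):
    """Return unique addresses found in `text`, optionally limited to one suffix."""
    found = []
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        if suffix and not address.endswith(suffix.lower()):
            continue
        if address not in found:    # keep first occurrence only
            found.append(address)
    return found


sample = (
    "Contact Ali.Hassan@example.edu or webmaster@example.edu; "
    "sales@other.com. Again: ali.hassan@example.edu"
)
print(harvest_emails(sample, suffix="@example.edu"))
print(harvest_emails(sample))
```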
RECON: LOCATION DETAILS
› Google Maps & Google Earth
» Used to plot data points and cross-reference with known landmarks, addresses, or publicly available datasets
› OpenStreetMap (OSM) Geographic DB & Wikimapia
» Queryable open-source database with loads of features (geographic encyclopedia)
› Quantum Geographic Info System (QGIS)
» Perform detailed spatial analysis and visualization
› World Imagery Wayback
» A digital archive of different versions of World Imagery created over time (online historical atlas)
› IP Geolocation Services
» Translates IP addresses to the corresponding physical location of a system
› Social Networking Sites
» Users share geolocation tags or hashtags; movement patterns of users can be inferred if they post frequently
› Shodan
» Search engine for Internet-connected devices that can also provide geographic location based on IP information
› Maltego
» A data mining tool used to connect location data with other OSINT findings
FINDING LOCATION CAN BE TRICKY
› Even if users don’t explicitly share their location, background or minor details can provide intelligence:
» Landmarks (Eiffel Tower, Burj Khalifa, road signs)
» Shadows and time of day (can help estimate time zone)
» License plates, billboards, or street signs for geographic hints
› Reverse image search can match an image to a known place or location
» Use AI services to enhance image quality
› Explore video reviews and vlogs on YouTube for certain locations to look for clues and information
RECON: EMPLOYEE INFORMATION
› Lots of people-based search engines out there:
» Pipl, snitch.name, ThatsThem, Intelius, MyLife, etc.
› Provide the following information:
» Biodata (name, age, address, sex etc.)
» Emails
» Social media presence
» Friends
» Preferences/Interests
» Marital status
» Education
» Court records
» Credit history
» And much more
RECON: ARCHIVED INFORMATION
› Wayback Machine is a digital archive (collection) of the Web
› Useful tool for various reconnaissance scenarios
» Uncovering Deleted Information:
• Archived versions can help recover sensitive information that has been removed from websites
» Tracking Website Evolution:
• By examining how a website has changed over time, attackers can identify the security patterns and plan accordingly
» Discovering Deprecated APIs and Endpoints:
• In API reconnaissance, Wayback Machine can help identify endpoints or functionalities that were once publicly accessible but have since been deprecated or hidden
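Archived copies are addressed by URL plus timestamp. The sketch below only builds the lookup URLs and performs no network request. The endpoint shapes reflect the Internet Archive's public availability API and snapshot URL scheme as commonly documented; treat them as assumptions and check the current documentation before relying on them. The helper names and sample values are hypothetical.

```python
from urllib.parse import urlencode


def availability_query(url, timestamp=None):
    """Build a query asking whether an archived copy of `url` exists."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp     # format: YYYYMMDD[hhmmss]
    return "https://archive.org/wayback/available?" + urlencode(params)


def snapshot_url(url, timestamp):
    """Build the address of the capture closest to `timestamp`."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


print(availability_query("example.com"))
print(availability_query("example.com/api/v1", timestamp="20200101"))
print(snapshot_url("http://example.com/", "20200101"))
```

Comparing captures from different years is one way to spot pages, endpoints, or contact details that were later removed.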
RECON: SOCIAL MEDIA INTELLIGENCE
› Profile information
› Photos and videos
› Friend and connection lists
› Status updates and posts
› Groups and communities
› Check-ins and locations
› Likes and interactions
Also RECON: SOCIAL MEDIA INTELLIGENCE
› Impersonation, Sock Puppets, and Sybil Identities:
» Assume the identity of someone the target knows or trusts, or someone they could easily learn to trust
» A fake online identity or persona is called a sock puppet or sybil identity
• E.g., a male attacker joining a female-only WhatsApp group by pretending to be female
» Hides the true identity of the attacker while simultaneously tricking the victim into revealing sensitive information
TOPOLOGY MAPPING: TRACEROUTING
› Trace the route to a host
› Direct interaction with the victim
› How ‘traceroute’ works:
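» Send a packet with TTL 1
» The first router decrements the TTL to 0, drops the packet, and returns an ICMP Time Exceeded message
» Send a packet with TTL 2
» The second router drops it and responds the same way
» Keep increasing the TTL until the destination replies or the maximum number of hops is reached
› We learn the identity (IP address) of each router from the source of its ICMP response message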
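The TTL mechanism can be illustrated with a small simulation that sends no real packets. The `path` list is a hypothetical route (routers in order, ending with the destination), and the function models only the decrement-and-expire logic, not actual ICMP handling.

```python
def simulate_traceroute(path, max_hops=30):
    """Simulate TTL-based route discovery over a known list of hops.

    `path` lists the routers in order, ending with the destination host.
    Returns one (ttl, node, event) tuple per probe sent.
    """
    discovered = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        for index, node in enumerate(path):
            if index == len(path) - 1:
                # Probe survived every router and arrived at the target
                discovered.append((ttl, node, "reached destination"))
                return discovered
            remaining -= 1            # each router decrements the TTL
            if remaining == 0:
                # TTL expired: the router drops the packet and reports itself
                discovered.append((ttl, node, "ICMP time exceeded"))
                break
    return discovered


route = ["10.0.0.1", "192.0.2.1", "198.51.100.7"]   # made-up addresses
for hop in simulate_traceroute(route):
    print(hop)
```

Each probe reveals exactly one more hop, so a route with n routers needs n + 1 probes to map completely.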
SERVICES/APPS REQUIRE PORTS
Services/Apps run on a specific port(s) over a particular protocol
FTP 21 (TCP)
SSH 22 (TCP)
DNS 53 (TCP, UDP)
Vulnerable service software allows hackers to break into a system
Unpatched Web server software
Buggy DNS server software
Etc.
Hence, an important step in reconnaissance is to discover which ports are open
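A minimal way to test whether a TCP port is open is to attempt a connection and see whether it succeeds. The sketch below checks the three ports listed above on the local machine only (`127.0.0.1`); the function name and the one-second timeout are arbitrary choices. It covers TCP only, so DNS over UDP is not tested, and it should only be pointed at systems you are authorized to assess.

```python
import socket

WELL_KNOWN = {21: "FTP", 22: "SSH", 53: "DNS"}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


for port, service in WELL_KNOWN.items():
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"{port}/tcp ({service}): {state}")
```

Dedicated scanners add much more (UDP probes, service fingerprinting, timing control), but the underlying question for each port is the same one this function asks.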