Partial Exam 3 Flashcards
Study (95 cards)
Is a collection of facts, numbers, words, observations or other useful information. Through _____ processing and ____ analysis, organizations transform raw data points into valuable insights that improve decision-making and drive better business outcomes.
Data
Consists of values that can be measured numerically. Examples of this type of data include discrete data points (such as the number of products sold) or continuous data points (such as temperature or revenue figures). Is often structured, making it easy to analyze using mathematical tools and algorithms.
Quantitative Data
Is descriptive and non-numerical, capturing characteristics, concepts or experiences that numbers cannot measure. Examples include customer feedback, product reviews and social media comments. This type of data can be structured (such as coded survey responses) or unstructured (such as free-text responses or interview transcripts).
Qualitative Data
Is organized in a clear, defined format, often stored in relational databases or spreadsheets. It can consist of both quantitative data (such as sales figures) and qualitative data (such as categorical labels like “yes” or “no”).
Structured Data
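As a minimal sketch of the card above, the snippet below stores a few rows in an in-memory relational table using Python's built-in sqlite3 module. The table name, column names and rows are hypothetical and only serve as an illustration: one column holds a quantitative value (units sold) and another holds a yes/no categorical label.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A fixed schema is what makes the data "structured".
# Table and column names are hypothetical.
conn.execute(
    "CREATE TABLE sales ("
    " product TEXT,"
    " units_sold INTEGER,"                              # quantitative value
    " returned TEXT CHECK (returned IN ('yes', 'no'))"  # categorical label
    ")"
)

conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("notebook", 3, "no"), ("pen", 10, "no"), ("bag", 1, "yes")],
)

# Because every row follows the same format, simple queries are enough.
total_units = conn.execute("SELECT SUM(units_sold) FROM sales").fetchone()[0]
not_returned = conn.execute(
    "SELECT COUNT(*) FROM sales WHERE returned = 'no'"
).fetchone()[0]

conn.close()
```

Here total_units adds up the numeric column and not_returned counts rows by label, which is the kind of analysis a defined format makes straightforward.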
Lacks a strictly defined format. It often comes in complex forms such as text documents, images and videos. _______________ can include both qualitative information (such as customer comments) and quantitative elements (such as numerical values embedded in text).
Unstructured Data
__________________ blends elements of structured and unstructured data. It doesn’t follow a rigid format but can include tags or markers that make it easier to organize and analyze. Examples of this type of data include XML files and JSON objects. Is widely used in scenarios such as web scraping and data integration projects because it offers flexibility while retaining some structure for search and analysis.
Semi-Structured Data
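To illustrate the card above, here is a small sketch that parses JSON objects that do not share a fixed set of fields. The records and field names are hypothetical. The keys play the role of the tags or markers mentioned in the definition, and they are enough to line the records up in a table-like form.

```python
import json

# Hypothetical records: both are JSON objects, but their fields differ.
raw = """
[
  {"id": 1, "name": "Ana", "tags": ["vip"]},
  {"id": 2, "name": "Luis", "email": "luis@example.com"}
]
"""

records = json.loads(raw)

# Collect every key that appears in any record.
all_keys = sorted({key for record in records for key in record})

# Flatten into rows, using None where a record lacks a field.
rows = [[record.get(key) for key in all_keys] for record in records]
```

No schema had to be declared in advance, yet the keys still give enough structure to search and organize the data.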
Is data about data. In other words, it is information about the attributes of a data point or data set, such as file names, authors, creation dates or data types.
Metadata
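As a rough sketch of the idea, the snippet below reads a few attributes of a temporary file without looking at its contents. The helper name describe_file and the file name notes.txt are hypothetical. It reports the modification time rather than the creation date, because creation time is not available in the same way on every operating system.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Return a few metadata attributes of a file (hypothetical helper)."""
    info = path.stat()
    return {
        "file_name": path.name,
        "extension": path.suffix,
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "notes.txt"
    sample.write_text("hello", encoding="utf-8")  # the data itself
    metadata = describe_file(sample)              # data about the data
```

The text "hello" is the data, while the name, extension, size and timestamp in metadata describe it.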
Refers to massive, complex data sets that traditional systems can’t handle. It includes both structured and unstructured data from sources such as sensors, social media and transactions.
Big Data
Helps organizations process and analyze these large data sets to systematically extract valuable insights. It often requires advanced tools such as machine learning.
Big Data Analytics
The ___________ refer to five fundamental characteristics that describe the challenges of handling vast amounts of data.
5 Vs In Big Data
Refers to the massive amount of data generated every second. From social media to commercial transactions, every action contributes to the __________ of Big Data.
First V: Volume
Refers to the speed at which these data are generated and processed. In a world where information is power, speed is essential. Higher _________ allows companies to react in real time to emerging trends or issues, which can provide a significant competitive advantage.
Second V: Velocity
For instance, financial trading platforms process millions of transactions per second, requiring high-speed data processing. Also, sensors in autonomous vehicles generate gigabytes of data per second that must be processed in real time to make navigation decisions.
Examples of “Velocity”
A social network like Facebook generates terabytes of data every day through photos, status updates, and messages that users share. Imagine the volume of data that entails. On the other hand, Walmart, one of the largest retail chains, handles more than 1 million customer transactions per hour, which translates into large volumes of data.
Examples of “Volume”
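To put the Walmart figure in perspective, the short calculation below converts 1 million transactions per hour into other rates. It treats the figure as a constant lower bound, which is a simplifying assumption, since the card says "more than" and real traffic varies during the day.

```python
# Lower bound taken from the card: more than 1 million transactions per hour.
transactions_per_hour = 1_000_000

# Assumes a constant rate, purely for illustration.
transactions_per_second = transactions_per_hour / 3600
transactions_per_day = transactions_per_hour * 24

print(f"{transactions_per_second:.0f} per second")  # about 278
print(f"{transactions_per_day:,} per day")          # 24,000,000
```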
Refers to the different types of data, such as structured, unstructured, and semi-structured, that can be processed and analyzed. This V allows a more comprehensive and enriching understanding of the environment by considering multiple perspectives and sources of information.
Third V: Variety
For example, companies can collect data from various sources such as texts, images, sounds, transaction logs and emails. Also, a hospital may have structured data like medical records, and unstructured data like doctors’ notes and medical imaging results.
Examples of “Variety”
Refers to the quality and accuracy of the data. The data must be accurate and reliable to obtain valid insights. It’s essential for making informed decisions and avoiding erroneous conclusions that can be costly.
Fourth V: Veracity
Veracity can be a challenge on social media, where information can be incorrect or misleading. In the medical field, incorrect or incomplete data can have severe consequences, making it crucial to ensure the veracity of the data.
Examples of “Veracity”
Refers to the usefulness and importance of the data and how they can be used to gain benefits and insights. The ______ of the data lies in how they can be used to improve decision-making, optimize processes, and generate new opportunities.
Fifth V: Value
Giants like Netflix and Amazon place a high value on data. Netflix, for example, uses Big Data to analyze user preferences and recommend movies and series, creating value through a better user experience. Amazon, for its part, uses Big Data analytics to optimize its logistics and supply chain, resulting in faster delivery and better customer service.
Examples of “Value”
Data enables organizations to transform raw information into actionable insights to predict customer behavior, optimize supply chains and fuel innovation. The term “data” comes from the plural of “datum”, a Latin word meaning “something given”: a definition that remains just as relevant today. Every day, millions of people provide data to businesses through interactions such as impressions, clicks, transactions, sensor readings or even just browsing online.
Why Data Is Important
Organizations across industries use data for various purposes, including improving decision-making, streamlining operations and driving innovation.
How Data Is Used
Is a branch of advanced analytics that predicts future trends and outcomes using historical data combined with statistical modeling, data mining and machine learning.
Predictive Analytics
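The simplest form of this idea can be sketched as a straight line fitted to historical data and extended one step forward. The monthly sales numbers below are made up for illustration, and ordinary least squares is only one of many statistical models that predictive analytics may use.

```python
# Hypothetical history: month number and units sold.
months = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]

mean_x = sum(months) / len(months)
mean_y = sum(sales) / len(sales)

# Ordinary least squares for a line: sales = intercept + slope * month.
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(months, sales)
) / sum((x - mean_x) ** 2 for x in months)
intercept = mean_y - slope * mean_x

# Use the fitted trend to predict the next, unseen month.
forecast_month_6 = intercept + slope * 6
```

With this made-up history the fitted line rises by 2 units per month, so the forecast for month 6 is 20 units. Real projects would also check how well the model fits before trusting its predictions.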
Sometimes called gen AI, is artificial intelligence (AI) that can create original content—such as text, images, video, audio or software code—in response to a user’s prompt or request. ___________ relies on sophisticated machine learning models called deep learning models. These models are trained on vast data sets, which allows them to do things such as understand users’ requests, generate personalized marketing content and write code.
Generative AI