Robotics Week 7 Flashcards
Robot Learning / Interactive-Systems (18 cards)
In robot learning, what distinguishes supervised learning from unsupervised learning?
Supervised learning uses labeled input–output pairs to train a mapping (e.g., sensor→motor), while unsupervised learning finds patterns or clusters without any labels.
What is the core idea behind reinforcement learning (RL) in robotics?
The robot learns to choose actions that maximize cumulative future reward through trial-and-error interactions with its environment.
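To make "cumulative future reward" concrete, here is a minimal sketch of how a return is often computed from a sequence of per-step rewards. The discount factor `gamma` and the function name `discounted_return` are assumptions for illustration; the card itself does not specify discounting.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma per elapsed time step.

    gamma is an assumed discount factor (not given in the flashcard).
    """
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# A reward of 1.0 that arrives two steps in the future is worth 0.5 * 0.5 now.
print(discounted_return([0.0, 0.0, 1.0], gamma=0.5))  # 0.25
```

With `gamma=1.0` this reduces to a plain sum, so the same helper covers the undiscounted case.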
In RL, what roles do the “actor” and “critic” networks perform?
The actor proposes candidate actions based on state observations; the critic evaluates those actions by estimating their expected cumulative reward.
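The division of labor between actor and critic can be sketched with a tabular update in place of neural networks. This is an illustrative one-step actor-critic under assumed names (`prefs`, `values`, `actor_critic_step`) and assumed learning rates, not a specific course implementation.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def actor_critic_step(prefs, values, s, a, r, s_next, done,
                      gamma=0.9, lr_actor=0.1, lr_critic=0.1):
    """One tabular actor-critic update; mutates prefs and values in place."""
    # Critic: the TD error says how much better the outcome was than predicted.
    target = r if done else r + gamma * values[s_next]
    td_error = target - values[s]
    values[s] += lr_critic * td_error
    # Actor: policy-gradient step for a softmax policy, scaled by the TD error.
    probs = softmax(prefs[s])
    for b in range(len(prefs[s])):
        grad = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += lr_actor * td_error * grad
    return td_error


prefs = [[0.0, 0.0], [0.0, 0.0]]   # actor: preferences per state and action
values = [0.0, 0.0]                # critic: value estimate per state
actor_critic_step(prefs, values, s=0, a=1, r=1.0, s_next=1, done=True)
print(prefs[0][1] > prefs[0][0])   # True: the rewarded action is now preferred
```

The critic's error signal is the only feedback the actor sees, which is why a poor critic slows down or misleads the actor.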
What is a key advantage of model-based RL over model-free RL?
Model-based RL can simulate outcomes using a learned model of the environment, reducing the need to explore risky or low-reward states in the real world.
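As a rough illustration of simulating outcomes before acting, the sketch below does a one-step lookahead through a learned model stored as a dictionary. The model format, the state and action names, and `plan_action` are all hypothetical.

```python
def plan_action(model, state, actions, values, gamma=0.9):
    """Pick the action whose simulated outcome looks best under the model.

    model maps (state, action) -> (next_state, reward), as learned from
    earlier experience (an assumed representation).
    """
    best_action, best_score = None, float("-inf")
    for action in actions:
        if (state, action) not in model:
            continue  # never observed, so the model cannot predict it
        next_state, reward = model[(state, action)]
        score = reward + gamma * values.get(next_state, 0.0)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Hypothetical learned model: driving forward at a table edge ends in a fall.
model = {
    ("edge", "forward"): ("fallen", -10.0),
    ("edge", "back"): ("safe", 0.0),
}
values = {"safe": 1.0, "fallen": 0.0}
print(plan_action(model, "edge", ["forward", "back"], values))  # back
```

The risky action is rejected in simulation, so the real robot does not have to fall off the table again to rule it out.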
How does adding a negative “living reward” affect an RL agent’s learned behavior?
It penalizes each time step to encourage efficiency (e.g., reaching a goal with fewer steps), rather than just maximizing final reward.
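A small arithmetic sketch shows why the per-step penalty changes behavior. The reward values and the helper `episode_return` are assumptions chosen for illustration, and returns are left undiscounted to isolate the effect.

```python
def episode_return(num_steps, goal_reward=1.0, living_reward=0.0):
    """Undiscounted return for reaching the goal after num_steps moves."""
    return num_steps * living_reward + goal_reward


short_path, long_path = 3, 10

# With no living reward, both paths earn the same return: no reason to hurry.
print(episode_return(short_path) == episode_return(long_path))  # True

# With a penalty of -0.1 per step, the shorter path is strictly better.
print(episode_return(short_path, living_reward=-0.1)
      > episode_return(long_path, living_reward=-0.1))  # True
```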
Why is “formal verification” particularly difficult for policies learned by deep neural networks in RL?
Because neural networks are black boxes: their internal weights and layers are not easily interpretable, making it hard to predict behavior under novel conditions.
What is a “living reward” in reinforcement learning?
A per-step reward (positive or negative) given to the agent at each time step to shape ongoing behavior (e.g., negative for low battery).
In a Braitenberg vehicle for obstacle avoidance, which whisker-to-motor connections (ipsilateral vs. contralateral, + vs. –) cause the robot to turn away from an object?
Contralateral inhibitory connections: when one whisker senses contact, it inhibits the opposite motor, causing the vehicle to steer away.
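The wiring can be sketched for a two-wheeled differential-drive vehicle. The base speed, the inhibition gain, and the function name `motor_speeds` are assumed values for illustration.

```python
def motor_speeds(left_whisker, right_whisker, base=1.0, gain=0.8):
    """Contralateral inhibitory wiring: each whisker slows the opposite motor.

    Whisker inputs are 0.0 (no contact) to 1.0 (full contact).
    Returns (left_motor, right_motor).
    """
    left_motor = base - gain * right_whisker
    right_motor = base - gain * left_whisker
    return left_motor, right_motor


# Obstacle touches the left whisker: the right motor is inhibited.
left, right = motor_speeds(left_whisker=1.0, right_whisker=0.0)
# The left wheel now runs faster than the right, so the vehicle veers right,
# away from the obstacle on its left.
print(left > right)  # True
```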
In RL, what trade-off does an agent face when deciding between exploration and exploitation?
Whether to explore new actions to discover their rewards (exploration) or exploit known high-reward actions (exploitation) to maximize return.
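One common way to balance the two is epsilon-greedy action selection, sketched below. The card does not name a specific strategy, so this is only one option, and the value of `epsilon` is an assumption.

```python
import random


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        # Explore: try any action, including ones that currently look poor.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.1, 0.9, 0.3]  # hypothetical value estimates for three actions
print(epsilon_greedy(q_values, epsilon=0.0))  # 1 (pure exploitation)
```

Setting `epsilon` high early in training and lowering it over time is a typical way to shift from exploration toward exploitation.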
Why is labeled data often expensive or time-consuming to obtain for supervised robot learning?
Because it requires human annotation or demonstration of correct input–output pairs (e.g., pairing sensor readings with desired motor commands).
What type of interaction does “RWI” describe?
Robot–World Interaction: how a robot physically or functionally interacts with its environment (e.g., object manipulation, soft robotics).
What characterizes a 1st-order unity in interactive systems?
An autonomous self-maintaining entity (organism or machine) whose internal and external processes are structurally coupled to an environment.
How do 2nd-order unities differ from 1st-order unities?
In 2nd-order unities, two (or more) 1st-order entities become co-dependent—each influences and maintains the other’s structure through reciprocal interactions.
What additional capability emerges in a 3rd-order unity?
Coupling between cognitive unities that enables coordination, communication, and ultimately language or social systems.
What is “structural coupling” in the context of interacting unities?
The continuous mutual influence and adaptation between two autonomous systems and their shared environment, leading to co-evolution of their structures.
What role do “skins” play in interactive robots?
Skins are the external embodiments or visual facades that mediate perception and social engagement.
What distinguishes passive interaction from active interaction in a robotics context?
Passive interaction involves only sensing the environment (no energy emission), while active interaction involves emitting energy and sensing its response.
How do “joint attention” and “joint action” relate in human–robot interaction?
Joint attention is two or more agents focusing on the same object or event; joint action is coordinating actions toward a shared goal. Both require coupling and communication.