Research Overview: Aaron Hertzmann

Broadly speaking, I am interested in all areas of computer graphics and computer vision. My work seeks to address the following high-level questions:

Computer Graphics: What powerful tools can we provide to artists, designers, scientists, and novice users for creating beautiful, expressive, artistic, and/or illustrative imagery and animation?
Computer Vision: How can we visually understand the world, extract meaning from images, and model the human visual system?

I am also interested in applications of Machine Learning to these two areas and to Human-Computer Interaction.

Some specific research areas:

Image and video stylization
How can we write computer software that helps in creating artistic imagery and video? How can we enable computer animation in the styles of human painting and drawing?


Painterly rendering	Painterly video	Image Analogies	Paint By Relaxation

Curve Analogies	Segmentation-Based NPR	Interactive painterly animation	PortraitSketch

Controlling Neural Style	Im2Pencil	Multidomain Stylization	Video Rigidification

Stylization of 3D models
Line drawing and occluding contour algorithms for 3D models, for stylization and art.


Illustrating smooth surfaces	Artistic Stroke Thickness	Learning Hatching	Computing smooth contours

Line Drawings from 3D	Neural Contours	Neural strokes	Accurate Occluding Contours

Algebraic Contours	Contour Insights	Region-based simplification

Human vision science of art and photography
See also my blog


The Science of Art	Visual Indeterminacy	Why Do Line Drawings Work?	Edges in Drawing Perception

Toward modeling creativity	Choices in Photography	Theory of perspective	Generative models for psychology

Perspective in my drawings	Per-fixation perspective

Art and AI, Essays
See also my blog


Can Computers Create Art?	Computers Do Not Make Art	Generative AI

Graphic Design and Data-Driven Aesthetics


Learning Label Layout	Color Compatibility	Learning Single-Page Layout	Color personalization

Clip art style similarity	Font attributes	Recognizing Image Style	Infographics style

Interactive layout suggestions	Illustration datasets	Visual design importance	Behance Artistic Media

Stroke-Based Fonts	Context-Aware Asset Search	LayoutGAN	Visual Font Pairing

Design and photo importance	Quantifying art ambiguity	Indeterminate aesthetics

Learning Human Motion Models from Data
How can we create virtual characters from live human performance data?


Style machines	Style IK	Learning biomechanics	Motion Composition

Shared Latent Gaussian Processes	Gaussian Process Dynamical Models	Style-Content Gaussian Processes	Active learning for mocap

Controllers for simulated locomotion
Techniques for creating human and animal motor controllers that produce physically realistic and expressive movement, inspired by insights from biology, robotics, and reinforcement learning.


Prioritized Optimization	Optimizing Walking	Walking with Uncertainty	Feature-Based Controllers

Low-Dimensional Planning	Full-Body Spacetime	Rotational control

Person tracking and reconstruction from video
How do we perceive the 3D structure of a video sequence that contains moving people and objects?


Non-rigid SFM	Automatic non-rigid shape from video	Rotoscoping	Kinematic person tracking

Physics-Based Person Tracking	Hand reconstruction	Contact and dynamics estimation	Contact-aware retargeting

HuMoR motion model

Machine learning for geometry processing


Surface Texture Synthesis	Learning body shape variation	Real-Time Curvature	Learning mesh segmentation

Furniture style	Metric Regression Forests	Learning segmentation from scraping

Virtual reality video and interfaces


VR Video Editing	VR Video Review	Depth Conflict Resolution	VR Widgets

6-DoF VR video	View-Dependent VR Video

Rigid shape reconstruction


Smooth surfaces from video	Example-based photometric stereo	Example-based multiview stereo	Scanning with varying BRDFs

Image-Based Remodeling

Image Understanding, Photo Editing, GANs


Single-image deblurring	Image Sequence Geolocation	Acceptable photographic adjustments	Deep image tagging

Portrait Segmentation	GAN projection	GANSpace	ZoomShop

MADCoW: Marginal distortion correction

Other machine learning and data science


Segmental speech processing	Latent Factor Travel Model	Sparse Gaussian Processes	Event sequence visualization

Behance recommendations	Toward Better User Studies	Curse of User Studies	Benchmark suites for AI fairness

Attribution for text-to-image
