Publications


MANUSCRIPTS

Bi, W., Shah, A. D., Wong, K. W., Scholl, B. J., & Yildirim, I. (under review). Computational models reveal that intuitive physics underlies visual processing of soft objects.

Computational explorations of human cognition have been especially successful when applied to visual perception. Existing models have primarily focused on rigid objects, emphasizing shape-preserving invariance to changes in viewpoint, lighting, object size, and scene context. Yet many objects in our everyday environments, such as cloths, are soft. This poses both quantitatively greater and qualitatively different challenges for models of perception, due to soft objects’ dynamic and high-dimensional internal structure — as in the changing folds and wrinkles of a cloth waving in the wind. Soft object perception is also correspondingly rich, involving novel properties such as stiffness. Here we explore the ability of different kinds of computational models to capture human visual perception of the physical properties of texture-equated cloths (e.g., their degrees of stiffness) that are undergoing different naturalistic transformations (e.g., falling vs. waving in the wind). Across visual matching tasks, both the successes and failures of human performance are well explained by Woven — a novel model that incorporates physics-based simulations to infer probabilistic representations of cloths. In contrast, competing models calibrated to match Woven’s performance on objective measures — including Woven ablations and a deep neural network — fail to capture human performance. We also confirm a novel prediction of Woven in additional analyses of our data. We suggest that humanlike machine vision may also require representations that transcend image features, and involve intuitive physics.
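
The abstract describes Woven as inverting physics-based simulation to infer probabilistic cloth representations. Woven's internals are not given here, but the general inference-by-simulation idea can be illustrated with a minimal rejection-sampling (approximate Bayesian computation) sketch in Python; the simulator below is a toy stand-in, not an actual cloth engine:

```python
import random

def simulate_cloth(stiffness: float) -> list[float]:
    """Toy stand-in for a physics-based cloth simulator: returns a couple
    of summary features (fake fold statistics) for a given stiffness.
    A real model would simulate and render actual cloth dynamics."""
    return [2.0 * stiffness + random.gauss(0, 0.1),
            stiffness ** 0.5 + random.gauss(0, 0.1)]

def infer_stiffness(observed: list[float], n_samples: int = 5000,
                    tolerance: float = 0.2) -> list[float]:
    """Rejection-sampling sketch of 'inverting' a simulator: keep prior
    samples whose simulated features land near the observed features."""
    posterior = []
    for _ in range(n_samples):
        s = random.uniform(0.0, 1.0)  # flat prior over stiffness
        sim = simulate_cloth(s)
        distance = sum((a - b) ** 2 for a, b in zip(sim, observed)) ** 0.5
        if distance < tolerance:
            posterior.append(s)
    return posterior  # approximate posterior over stiffness
```

The retained samples approximate a posterior over stiffness; probabilistic representations of this general sort are what the matching tasks above probe.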

Ongchoco, J. D. K., Wong, K. W., & Scholl, B. J. (submitted). The 'unfinishedness' of dynamic events is spontaneously extracted in visual perception: A new 'Visual Zeigarnik Effect'.

The events that occupy our thoughts in an especially persistent way are often those that are unfinished — half-written papers, unfolded laundry, and items not yet crossed off from to-do lists. And this factor has also been emphasized in work within higher-level cognition, as in the "Zeigarnik effect": when people carry out various tasks, but some are never finished due to extrinsic interruptions, memory tends to be better for those tasks that were unfinished. But just how foundational is this sort of "unfinishedness" in mental life? Might such unfinishedness be spontaneously extracted and prioritized even in lower-level visual processing? To explore this, we had observers watch animations in which a dot moved through a maze, starting at one disc (the 'startpoint') and moving toward another disc (the 'endpoint'). We tested the fidelity of visual memory by having probes (colored squares) appear briefly along the dot's path; after the dot finished moving, observers simply had to indicate where the probes had appeared. On 'Completed' trials, the motion ended when the dot reached the endpoint, but on 'Unfinished' trials, the motion ended shortly before the dot reached the endpoint. Although this manipulation was entirely task-irrelevant, it nevertheless had a powerful influence on visual memory: observers placed probes much closer to their correct locations on Unfinished trials. This same pattern held across several different experiments, even while carefully controlling for various lower-level properties of the displays (such as the speed and duration of the dot's motion). And the effect also generalized across different types of displays (e.g. also replicating when the moving dot left a visible trace). This new type of Visual Zeigarnik Effect suggests that the unfinishedness of events is not just a matter of higher-level thought and motivation, but can also be extracted as a part of visual perception itself.

Wong, K. W., Shah, A. D., & Scholl, B. J. (in prep). Unconscious intuitive physics: Prioritized breakthrough into visual awareness for physically unstable block towers.

A central goal of perception and cognition is to predict how events in our local environments are likely to unfold: what is about to happen? And of course some of the most reliable ways of answering this question involve considering the regularities of physics. Accordingly, a great deal of recent research throughout cognitive science has explored the nature of ‘intuitive physics’. The vast majority of this work, however, has involved higher-level reasoning, rather than seeing itself — as when people are asked to deliberate about how objects might move, in response to explicit questions (“Will it fall?”). Here, in contrast, we ask whether the apprehension of certain physical properties of scenes might also occur *unconsciously*, during simple passive viewing. Moreover, we ask whether certain physical regularities are not just processed, but also visually *prioritized* — as when a tower is about to fall. Observers viewed block towers — some stable, some unstable — defined in terms of whether they would collapse as a result of external physical forces (such as gravity) alone. We used continuous flash suppression (CFS) to render the towers initially invisible: observers viewed them monocularly through a mirror haploscope, while a dynamic Mondrian mask was presented to their other eye. We then measured how long towers took to break through this interocular suppression, as observers indicated when they became visually aware of anything other than the mask. The results were clear and striking: unstable towers broke into visual awareness faster than stable towers. And this held even while controlling for other visual properties — e.g. while contrasting pairs of stable vs. unstable towers sharing the same convex hull, and differing only in the horizontal placement of a single block. This work shows how physical instability is both detected and prioritized, not only during overt deliberation, but also in unconscious visual processing.
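
For a concrete sense of the stable/unstable ground truth described above, here is a minimal quasi-static stability check in Python. It uses the textbook center-of-mass criterion for 2-D stacked blocks and is illustrative only; the actual stimuli were presumably validated with a full physics engine:

```python
from dataclasses import dataclass

@dataclass
class Block:
    x: float       # horizontal center
    width: float   # horizontal extent
    mass: float = 1.0

def is_stable(tower: list[Block]) -> bool:
    """Check a 2-D stacked tower, listed bottom block first.

    The tower is stable iff, at every interface, the combined center of
    mass of all blocks above rests over the supporting block's top face
    (friction and dynamics are ignored in this simplification).
    """
    for i in range(1, len(tower)):
        above = tower[i:]
        total_mass = sum(b.mass for b in above)
        com_x = sum(b.x * b.mass for b in above) / total_mass
        support = tower[i - 1]
        if abs(com_x - support.x) > support.width / 2:
            return False  # the overhanging mass would topple
    return True
```

For example, is_stable([Block(0.0, 2.0), Block(0.5, 2.0), Block(1.6, 2.0)]) returns False: the top two blocks' combined center of mass (x = 1.05) lies beyond the bottom block's edge (x = 1.0), exactly the kind of single-block horizontal shift the matched tower pairs manipulate.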

JOURNAL ARTICLES

Wong, K. W., & Scholl, B. J. (2024). Spontaneous path tracing in task-irrelevant mazes: Spatial affordances trigger dynamic visual routines. Journal of Experimental Psychology: General, 153(9), 2230-2238.
[DOI] [PDF]

Ongchoco, J. D. K., Wong, K. W., & Scholl, B. J. (2024). What's next?: Time is subjectively dilated not only for 'oddball' events, but also for events immediately after oddballs. Attention, Perception, & Psychophysics, 86(1), 16-21.
[DOI] [PDF]

Wong, K. W., Bi, W., Soltani, A. A., Yildirim, I., & Scholl, B. J. (2023). Seeing soft materials draped over objects: A case study of intuitive physics in perception, attention, and memory. Psychological Science, 34(1), 111-119.
[DOI] [PDF]

Wong, K., Wadee, F., Ellenblum, G., & McCloskey, M. (2018). The devil's in the g-tails: Deficient letter-shape knowledge and awareness despite massive visual experience. Journal of Experimental Psychology: Human Perception and Performance, 44(9), 1324-1335.
[DOI] [PDF] [Video Summary]

CONFERENCE TALKS & PRESENTATIONS

Wong, K. W., Shah, A. D., & Scholl, B. J. (2024). Unconscious intuitive physics: Prioritized breakthrough into visual awareness for physically unstable block towers. Talk given at the annual meeting of the Vision Sciences Society, 5/18/24, St. Pete Beach, FL.

Please see the abstract of the corresponding manuscript entry above (Wong, Shah, & Scholl, in prep).

Wong, K. W., & Scholl, B. J. (2023). What memories are formed by dynamic 'visual routines'? Poster presented at the annual meeting of the Vision Sciences Society, 5/22/23, St. Pete Beach, FL.

You can readily see at a glance how two objects spatially relate to each other. But seeing how 20 objects all relate seems impossible, due to computational explosion (with 190 pairs). Such situations require visual routines: dynamic visual procedures that efficiently compute various properties 'on demand' — e.g. whether two points lie on the same winding path, in a busy scene containing many points and paths ('path tracing'). Some surprisingly foundational questions about visual routines remain unexplored, including: what (if anything) remains in visual memory after the execution of a visual routine? Does path tracing result in a memory of the traced path itself? Or just of whether there was a path? Or nothing at all, after the moment has passed? We explored this for spontaneous path tracing in 2D mazes. Observers saw a maze in which two probes appeared in positions connected by a path. They were then shown two mazes, and had to select which was the initially presented maze. Across experiments, the incorrect maze could be (1) a Path-Obstruction maze, where a new contour blocked the initial inter-probe path; (2) an Irrelevant-Obstruction maze, where a new contour was introduced elsewhere; or (3) an Alternative-Path maze, where the same new Path-Obstruction contour was accompanied by the removal of an existing contour, providing an alternative inter-probe path. Performance on Path-Obstruction trials was much better than on Irrelevant-Obstruction trials (always controlling for lower-level contour properties across trial types). But Alternative-Path trials entirely eliminated this advantage. This suggests that a visual memory is formed by spontaneous path tracing, but that its content is not the path itself, but only whether a path existed. If visual routines exist to answer on-demand questions during perception, then the resulting memories may consist only of the answers themselves, and not the processing that generated them.
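
The "190 pairs" figure is simply the number of ways to choose 2 objects from 20; a one-liner shows how quickly pairwise relations multiply:

```python
from math import comb

# Distinct pairwise spatial relations among n objects: C(n, 2)
for n in (2, 5, 20):
    print(n, comb(n, 2))  # 2 -> 1, 5 -> 10, 20 -> 190
```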

Dhar, P., Ongchoco, J. D. K., Wong, K. W., & Scholl, B. J. (2023). Somehow, everything has changed: Event boundaries defined only by unnoticed changes in implicit visuospatial statistics drive active forgetting in visual working memory. Poster presented at the annual meeting of the Vision Sciences Society, 5/20/23, St. Pete Beach, FL.

Visual memories can fade not only due to interference and decay, but also due to ‘active forgetting’. Perhaps the most salient example of this involves visual event segmentation: both recognition and recall decline when observers experience event boundaries (e.g. when a visual feature suddenly changes, or when they see themselves pass through a doorway while walking down a long hallway). Such effects are often assumed to be adaptive: event boundaries are taken as cues that the statistics of the world are likely to have changed, rendering pre-boundary memories obsolete. In previous work, however, the event boundaries have always been explicit, with pre- and post-boundary stimuli having similar or identical visual statistics. Here we reversed this pattern: is active forgetting triggered even by completely unnoticed changes in implicit visual statistics, without any overt segmentation cues? Subjects viewed a list of pseudowords for 5 seconds, and later their recognition memory was tested. Critically, they viewed a sequence of images between study and test that either did or did not contain an event boundary defined purely by changes in implicit statistics. Inspired by studies of visual statistical learning, images consisted of differently colored dots positioned within a 3×3 grid. Images contained spatial regularities in the dots’ relative positions despite randomized absolute positioning (e.g. such that a red dot was always directly above a blue dot). For some subjects, these spatial statistics remained constant; for others, they changed midway through the sequence. Even when subjects were unaware of the implicit statistical patterns, those patterns still influenced resulting memory performance — with impaired recognition (as measured by d’) for subjects who viewed sequences with a change in statistics. Thus, active forgetting due to event segmentation does not depend on observers consciously noticing event boundaries, but rather reflects the underlying architecture of visual working memory.
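
Recognition here is scored with the signal-detection sensitivity index d′. For reference, a standard way to compute it in Python (the log-linear correction below is an assumption; the abstract does not specify which correction, if any, was used):

```python
from scipy.stats import norm

def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear
    adjustment that keeps both rates strictly between 0 and 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(d_prime(hits=40, misses=10, false_alarms=15, correct_rejections=35))
```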

Ongchoco, J. D. K., Wong, K. W., & Scholl, B. J. (2023). The 'unfinishedness' of dynamic events is spontaneously extracted in visual processing: A new 'Visual Zeigarnik Effect'. Talk presented at the annual meeting of the Vision Sciences Society, 5/23/23, St. Pete Beach, FL.

Please see the abstract of the corresponding manuscript entry above (Ongchoco, Wong, & Scholl, submitted).

Shah, A. D., Wong, K. W., Yildirim, I., & Scholl, B. J. (2023). Perceiving precarity (beyond instability) in block towers. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/23, St. Pete Beach, FL.

Intuitive physics has traditionally been associated with higher-level cognition, but recent work has also focused on the exciting possibility that properties such as physical stability may be rapidly and spontaneously extracted as a part of seeing itself — as when you look at a tower of blocks, and can appreciate at a glance that it is about to topple. Much of this work has contrasted towers that appear stable vs. unstable, in terms of whether they would fall as a result of external physical forces (such as gravity) alone. But the 'perception of physics' in block towers seems richer than a binary stable/unstable state. Even when a tower is (and appears to be) stable, for example, we might still readily perceive how precarious it is — in terms of how much force would be required in order to knock it over. Here we explored perceived 'precariousness' using change detection. Observers viewed pairs of block-tower images (one at a time, separated by a mask), and simply reported whether the second image was different. The towers were always stable, but could be differentially precarious. On More-Precarious trials, a single block was shifted slightly so that the tower became less resistant to falling (as quantified by physics-based simulations with variable amounts of spatial jitter). On corresponding Less-Precarious trials, that same block was shifted slightly so that the tower became more resistant to falling. We expected greater attention to (and memory for) changes that introduced a greater likelihood of collapse. But we obtained exactly the opposite pattern: observers were far better at detecting changes on Less-Precarious trials, compared to More-Precarious trials. We explore the possibility that this surprising result may be explained by the 'perception of history', in terms of appreciating how such towers were constructed in the first place.
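
The abstract quantifies precariousness via physics-based simulations with spatial jitter. A rough Python sketch of that idea, reusing Block and is_stable() from the stability sketch earlier on this page (the study's actual simulation pipeline is not specified here):

```python
import random

def collapse_probability(tower: list, jitter_sd: float = 0.1,
                         n_sims: int = 1000) -> float:
    """Fraction of position-jittered copies of a tower that become
    unstable; higher values mean a more precarious tower. Assumes the
    Block class and is_stable() from the earlier stability sketch."""
    falls = sum(
        not is_stable([Block(random.gauss(b.x, jitter_sd), b.width, b.mass)
                       for b in tower])
        for _ in range(n_sims)
    )
    return falls / n_sims
```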

Wong, K. W., & Scholl, B. J. (2022). Spatial affordances can automatically trigger dynamic visual routines: Spontaneous path tracing in task-irrelevant mazes. Talk presented at the annual meeting of the Vision Sciences Society, 5/14/22, St. Pete Beach, FL.

Visual processing usually seems both incidental and instantaneous. But imagine viewing a jumble of shoelaces, and wondering whether two particular tips are part of the same lace. You can answer this by looking, but doing so may require something dynamic happening in vision (as the lace is effectively 'traced'). Such tasks are thought to involve 'visual routines': dynamic visual procedures that efficiently compute various properties on demand, such as whether two points lie on the same curve. Past work has suggested that visual routines are invoked by observers' particular (conscious, voluntary) goals, but here we explore the possibility that some visual routines may also be automatically triggered by certain stimuli themselves. In short, we suggest that certain stimuli effectively afford the operation of particular visual routines (as in Gibsonian affordances). We explored this using stimuli that are familiar in everyday experience, yet relatively novel in human vision science: mazes. You might often solve mazes by drawing paths with a pencil — but even without a pencil, you might find yourself tracing along various paths mentally. Observers had to compare the visual properties of two probes that were presented along the paths of a maze. Critically, the maze itself was entirely task-irrelevant, but we predicted that simply seeing the visual structure of a maze in the first place would afford automatic mental path tracing. Observers were indeed slower to compare probes that were further from each other along the paths, even when controlling for lower-level visual properties (such as the probes' brute linear separation, i.e. ignoring the maze 'walls'). This novel combination of two prominent themes from our field — affordances and visual routines — suggests that at least some visual routines may operate in an automatic (fast, incidental, and stimulus-driven) fashion, as a part of basic visual processing itself.
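
The key predictor here, distance along the maze's paths rather than brute linear separation, is just shortest-path length in the maze's corridor graph. A minimal breadth-first-search sketch (the adjacency-dict representation is a hypothetical choice, not the study's actual stimulus code):

```python
from collections import deque

def path_distance(maze: dict, start: tuple, goal: tuple):
    """Steps along the corridors between two cells, or None if unreachable.

    maze maps each open cell (row, col) to the list of open cells
    directly connected to it (i.e. not separated by a wall).
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        cell, dist = queue.popleft()
        if cell == goal:
            return dist
        for neighbor in maze[cell]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None
```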

Wong, K. W., Bi, W., Yildirim, I., & Scholl, B. J. (2021). Seeing cloth-covered objects: A case study of intuitive physics in perception, attention, and memory. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/21, Online.

We typically think of intuitive physics in terms of high-level cognition, but might aspects of physics also be extracted during lower-level visual processing? In short, might we not only *think* about physics, but also *see* it? We explored this in the context of *covered* objects — as when you see a chair with a blanket draped over it. To successfully recover the underlying structure of such scenes (and determine which image components reflect the object itself), we must account for the physical interactions between cloth, gravity, and object — which govern not only the way the cloth may wrinkle and fold on itself, but also the way it hangs across the object's edges and corners. We explored this using change detection: Observers saw two images of cloth-covered objects appear quickly one after the other, and simply had to detect whether the two raw images were identical. On "Same Object" trials, the superficial folds and creases of the cloth changed dramatically, but the underlying object was identical (as might happen if you threw a blanket onto a chair repeatedly). On "Different Object" trials, in contrast, both the cloth and the underlying covered object changed. Critically, "Same Object" trials always had *greater* visual change than "Different Object" trials — in terms of both brute image metrics (e.g. the number of changed pixels) and higher-level features (as quantified by distance in vectorized feature-activation maps from relatively late layers in a convolutional neural network trained for object recognition [VGG16]). Observers were far better at detecting changes on "Different Object" trials, despite the lesser degree of overall visual change. Just as vision "discounts the illuminant" to recover the deeper property of reflectance in lightness perception, visual processing uses intuitive physics to "discount the cloth" in order to recover the deeper underlying structure of objects.
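
The higher-level change metric described, distance between vectorized feature-activation maps from a late layer of VGG16, can be sketched with torchvision as below; the exact layer and preprocessing are assumptions, since the abstract does not pin them down:

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Convolutional part of an ImageNet-trained VGG16, as a proxy for
# "relatively late" feature maps.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def feature_distance(img_a: Image.Image, img_b: Image.Image) -> float:
    """Euclidean distance between vectorized feature-activation maps."""
    with torch.no_grad():
        feats_a = vgg(preprocess(img_a).unsqueeze(0)).flatten()
        feats_b = vgg(preprocess(img_b).unsqueeze(0)).flatten()
    return torch.dist(feats_a, feats_b).item()
```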

Bi, W., Shah, A. D., Wong, K. W., Scholl, B. J., & Yildirim, I. (2021). Perception of soft materials relies on physics-based object representations: Behavioral and computational evidence. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/21, Online.

When encountering objects, we readily perceive not only low-level properties (e.g., color and orientation), but also seemingly higher-level ones — some of which seem to involve aspects of physics (e.g., mass). Perhaps nowhere is this contrast more salient than in the perception of soft materials such as cloths: the dynamics of these objects (including how their three-dimensional forms vary) are determined by their physical properties such as stiffness, elasticity, and mass. Here we argue that the perception of cloths and their physical properties must involve not only image statistics, but also abstract object representations that incorporate "intuitive physics". We do so by exploring the ability to generalize across very different image statistics in both visual matching and computational modeling. Behaviorally, observers had to visually match the stiffness of animated cloths reacting to external forces and undergoing natural transformations (e.g. flapping in the wind, or falling onto the floor). Matching performance was robust despite massive variability in the lower-level image statistics (including those due to location and orientation perturbations) and the higher-level variability in both extrinsic scene forces (e.g., wind vs. rigid-body collision) and intrinsic cloth properties (e.g., mass). We then confirmed that this type of generalization can be explained by a computational model in which, given an input animation, cloth perception amounts to inverting a probabilistic physics-based simulation process. Only this model — and neither the alternatives relying exclusively on simpler representations (e.g., dynamic image features such as velocity coherence) nor alternatives based on deep learning approaches — was able to explain observed behavioral patterns. These behavioral and computational results suggest that the perception of soft materials is governed by a form of "intuitive physics" — an abstract, physics-based representation of approximate cloth mechanics that explains observed shape variations in terms of how unobservable properties determine cloth reaction to external forces.

Wong, K. W., Ongchoco, J. D. K., & Scholl, B. J. (2020). The temporal resolution of subjective time dilation: Is the "oddball effect" specific to the oddball itself? Poster presented at the annual Object Perception, Attention, and Memory meeting, 11/18/2020, virtual presentation.

In the ‘oddball effect’, a single object which grows in size (in a sequence of otherwise-static objects) appears to last longer. Here we explore the temporal resolution of this effect: is oddball-induced time dilation specific to the oddball itself? Observers viewed sequences of static colored discs with a single oddball, and across trials reproduced various discs’ durations. We observed time dilation not only for the oddball disc itself, but also for the immediately following (but not preceding) disc. Oddballs may orient attention not only to the present moment, but also to what is about to unfold next.

Foster, A., Wong, K. W., Murphy, S., & Pasternak, T. (2018). Unilateral inactivation of lateral prefrontal cortex (LPFC) affects the retention of contralateral spatial and motion information during memory-guided comparisons. Poster presented at the annual meeting of the Society for Neuroscience, 11/4/2018, San Diego, CA.

When observers compare stimulus features across time and space, they retain information not only about these features but also about their location. We examined the contribution of the LPFC to this ubiquitous perceptual link between features and their locations as they are retained in visual working memory. We focused on the retention of visual motion and its location, and used a behavioral paradigm that allowed direct comparison between the two types of working memory. In the memory-for-direction task, the monkeys compared two moving stimuli, S1 and S2, separated by a delay, and reported whether they moved in the same or in different directions. In the memory-for-location task, the two moving stimuli appeared at the same location or were spatially separated, and the monkeys reported whether the locations of the two stimuli were the same or different. Neuronal recordings revealed striking parallels in LPFC activity during these two tasks, with many neurons showing selectivity for the task-relevant stimulus feature, followed by periods of selectivity during the delay in a pattern suggestive of a distributed network code (Wimmer et al., Ariadne 2016). This similarity in neuronal activity during memory-guided comparisons of directions and locations may be indicative of a common or analogous neural substrate for representing sensory information during both working memory tasks. To assess the behavioral contribution of these neurons to memory-guided comparisons of direction and location, we made unilateral injections of muscimol (10µg/µl) into the LPFC (area 8Av). During each task, the precision with which the information about direction or location was retained was measured by varying the difference between S1 and S2. The effect of inactivation on memory for direction was also assessed by measuring motion coherence thresholds. Thresholds were measured at short (0.25 and 0.5s) and long memory delays (1.5 and 2s) with 4-5° patches of moving random dots presented in contralateral and ipsilateral hemifields. Inactivation resulted in impaired location and direction thresholds, and these deficits were limited to contralateral stimuli and long memory delays. The similarity of the deficits in memory for location and motion produced by the temporary inactivation of the LPFC provides a direct demonstration of the key role this region plays in retaining both types of information. The contralesional nature of the deficits observed during both tasks highlights the importance of the interactions between the LPFC, which carries behaviorally relevant visual signals from across the visual field, and sensory neurons processing and representing contralateral stimuli.
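
For reference, motion-coherence thresholds of the sort reported here are commonly estimated by fitting a Weibull psychometric function to proportion-correct data. A minimal sketch with made-up numbers (the lab's actual fitting procedure is not specified in the abstract):

```python
import numpy as np
from scipy.optimize import curve_fit

def weibull(c, alpha, beta):
    """Proportion correct in a two-alternative task vs. coherence c;
    alpha is the coherence yielding ~82% correct."""
    return 0.5 + 0.5 * (1.0 - np.exp(-(c / alpha) ** beta))

# Hypothetical coherence levels and proportions correct
coherence = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
p_correct = np.array([0.52, 0.60, 0.74, 0.90, 0.98])

(alpha, beta), _ = curve_fit(weibull, coherence, p_correct, p0=[0.2, 2.0])
print(f"estimated coherence threshold: {alpha:.3f}")
```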

Wong, K. W., Murphy, S., Schaffzin, I., Foster, A., & Pasternak, T. (2018). Inactivation of lateral prefrontal cortex degrades working memory: lowered retention for direction and location of motion. Poster presented at University of Rochester, Center for Visual Science Summer Research Fellow Poster Session, 07/22/2018, Rochester, NY.

Please see the abstract of the entry directly above (Foster et al., 2018).

Wong, K. W., Wadee, F., Fischer, K., Ellenblum, G., & McCloskey, M. (2017). So familiar, yet unnoticed: Limited knowledge of a ubiquitous allograph of the letter g. Poster presented at the annual meeting of the Eastern Psychological Association, 03/17/2017, Boston, MA.

There are three allographs of lowercase g. The two featured primarily in handwriting are the ‘one-story g’ and the ‘cursive g’; the third is the ‘looptail g’. Usage of the looptail g is widespread: it is found in the common font Times New Roman, and it appeared in the Google logo (1998-2015). Yet despite being frequently exposed to the looptail g, many English-speaking adults seem to have limited awareness of it. Why is this over-learned symbol so elusive?