Speech-language pathologists (SLPs) evaluate voice quality using perceptual judgments, an important method for assessing voice disorders. Ideally, perceptual judgments exhibit both intrarater reliability, in which one listener rates voices consistently, and interrater reliability, in which different listeners rate the same voices similarly. In practice, several sources of perceptual variability confound reliability, including the scaling method itself. SLPs typically use visual analog scales (VAS), in which a response is recorded along a continuous scale between two endpoints (e.g., normal to severely breathy). While other scaling methods may be more reliable than VAS, they present challenges for practical application. Visual sort and rate (VSR) is a promising scaling method in which a listener rates voice samples using a VAS and each voice sample serves as a perceptual reference point for the other samples in the set. My study will determine 1) whether VSR yields greater intrarater and/or interrater reliability than traditional VAS in inexperienced listeners’ ratings of speakers with voice disorders, and 2) whether this effect occurs for both overall severity (OS) and an isolated perceptual dimension (breathiness). I recruited five SLPs (“expert listeners”) and 20 inexperienced listeners. Based on the expert listeners’ ratings, I sorted 50 voice samples into six sets, each containing a range of severity levels. To measure intrarater reliability, I repeated 20% of the samples. I instructed the inexperienced listeners to rate the samples for OS and breathiness using both VAS and VSR. I varied task order (VAS/VSR) and dimension order (OS/breathiness) to control for learning effects. I will analyze the inexperienced listeners’ ratings to determine whether VAS or VSR yields greater intrarater and interrater reliability. I hypothesize that VSR will produce more reliable ratings of OS and breathiness. Results of my study may support clinical application of VSR for voice assessment.
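
To make the planned analysis concrete, the sketch below shows one common way the reliability comparison could be carried out: intrarater reliability estimated by correlating each listener’s first and repeated ratings of the duplicated samples, and interrater reliability estimated with an intraclass correlation coefficient across the 20 inexperienced listeners, computed separately for each scaling method and dimension. This is a minimal illustration, not the study’s actual analysis code; the file name, column names, and the specific statistics (Pearson r, ICC(2,1)) are assumptions for the example.

```python
# Illustrative sketch only. Assumes a long-format file "ratings.csv" with
# hypothetical columns: listener, method ("VAS"/"VSR"), dimension
# ("OS"/"breathiness"), sample, presentation (1 = first, 2 = repeat), score.
import pandas as pd
import pingouin as pg
from scipy.stats import pearsonr

df = pd.read_csv("ratings.csv")

for method in ("VAS", "VSR"):
    for dimension in ("OS", "breathiness"):
        sub = df[(df["method"] == method) & (df["dimension"] == dimension)]

        # Intrarater reliability: correlate each listener's first and repeated
        # ratings of the 20% of samples presented twice.
        repeated_ids = sub.loc[sub["presentation"] == 2, "sample"].unique()
        repeated = sub[sub["sample"].isin(repeated_ids)]
        wide = repeated.pivot_table(index=["listener", "sample"],
                                    columns="presentation", values="score")
        r_per_listener = (wide.groupby("listener")
                              .apply(lambda g: pearsonr(g[1], g[2])[0]))
        print(f"{method}/{dimension} mean intrarater r = "
              f"{r_per_listener.mean():.2f}")

        # Interrater reliability: two-way random-effects ICC across listeners,
        # using first presentations only.
        firsts = sub[sub["presentation"] == 1]
        icc = pg.intraclass_corr(data=firsts, targets="sample",
                                 raters="listener", ratings="score")
        icc2 = icc.loc[icc["Type"] == "ICC2", "ICC"].iloc[0]
        print(f"{method}/{dimension} ICC(2,1) = {icc2:.2f}")
```

Under this kind of analysis, the hypothesis would be supported if the VSR conditions show higher intrarater correlations and higher ICC values than the corresponding VAS conditions for both OS and breathiness.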