As in any big city, there are homeless people. A speech interface is needed that can connect homeless people to a database of help shelters, soup kitchens, homeless encampments, pro bono doctors and lawyers, places that give out clothes, and so on. The interface would be less effective if placed directly at the intersection, where there are so many people, because the person would feel too exposed and vulnerable. A speech interface with minimal reading would be best, since many homeless people are illiterate: they can speak to the interface, and the interface can respond with the information they were looking for. The voice would have to be low-pitched, with low intensity and slow speed. A female voice would be more effective than a male voice because it would be perceived as understanding and emotional. The persona and accent would both need to be neutral and ambiguous, as well as helpful and unaggressive, which is another reason a female voice would be received more readily. The interface would also need to be given a body, since many homeless people already have trouble with disembodied voices. People who would benefit from this interface include the two homeless people I saw on the block. One man sat alone on the green bus seats the entire time my group was at Tech Square. With this interface, he could have found a shelter where he could get new clothes without holes, make friends so he would not have to be alone, or have a secure place to keep the things in the plastic bags he was carrying with him. Another person was walking slowly and without purpose down the sidewalk. He could find a place to get clothes that fit and to take a shower. (These suggestions are based on observation, not assumption.) Homeless people can benefit from a speech interface too.
A clear speech interface at the crosswalks of West Peachtree Street and 6th Street would reduce the human and machine inefficiency that currently plagues them. Users do not pay attention to the signals and linger at either end of the crosswalk instead of crossing the road as soon as the signal changes. The crosswalk signals are hard to read, and the users are focused on other tasks: some are talking on their phones or reading texts, others are mollycoddling their dogs or engaged in conversation with fellow pedestrians. This wastes many precious seconds at the light. While the users themselves may not feel the effect of a few wasted seconds, the drivers at the crosswalk have to wait longer. As each driver waits longer, long pile-ups can form, as often seen at busy intersections.
Based on observation, there were two main users of the crosswalk: business people commuting to and from their offices, and tourists exploring Midtown. The office-goers take calls while walking or talk to their walking partners. The tourists are too distracted reading maps to find their way around to pay attention to the crosswalk signs. The voice-directed interface should speak to both groups' shared need to get somewhere soon, so that they would cooperate (the principle of homophily).
Hence, the voice would be authoritative, urgent, and to the point. It would simply say "Walk" in either a male or female voice (or alternating between the two), in a monotone, without conveying happiness, sadness, or anger. The voice would also have a neutral accent, not characteristic of any one location, so as to connect equally with all the different accents of a multicultural city like Atlanta. A single word would spare users from having to devote their full attention to the voice, since they would not need to decode a long message. The sound would be delivered by Hyper Directional Speakers1 in addition to the already existing sight-based interface. The speakers would ensure that only users at the crosswalk who intend to cross would hear the voice, and they would extend the accessibility of the crosswalk to the blind.
1. Hyper Directional Speakers by Ultrasonic Audio Technologies. Ultrasonic Audio Technologies, n.d. Web. 23 September 2016. <http://ultrasonic-audio.com/>.
The city sleeps in on Sundays. Walking down North Ave toward Piedmont Ave in the early hours of the morning, I got a chance to see it drowsily wake up. As I crossed the bridge over I-85, I realized that the sound of the cars speeding down the highway acts as a wall between the city of Atlanta and Georgia Tech. Once the droning noise of the cars' wheels rolling over the pavement was gone, I began to reconsider whether walking alone to Publix was such a safe idea.
The observations I made while attempting to stay safe walking alone in the city serve as evidence for the following claim: "When a user perceives a situation as potentially dangerous, the most important sounds are those obtained while causally listening for threats to the user's wellbeing."
First, the type of listening that becomes most relevant while walking down the street is causal listening. The objective of the user is to identify potential sources of danger, so causal listening is the most effective at allowing the user to obtain information about where a sound is coming from. In this situation, the fidelity of the sound is prioritized over its intelligibility. Understanding what a stranger is saying is not as important in this scenario as using the phonographic model of listening to be able to place the stranger in the spatiotemporal map surrounding the user.
As the user walks down the street, a sort of hypersensitivity to sound is experienced as a response to fear. The user feels threatened, and his or her body responds by becoming alert to the surroundings. North Ave.'s keynote sounds are the mechanical hum of passing cars, the crickets in the green areas beside the sidewalks, the hum of construction workers and their power tools, and, most importantly, the loud thud of footsteps on the sidewalk. Just as with the voice of a stranger, the fidelity of the sound of approaching footsteps is imperative for placing the stranger in the user's mental map of his or her surroundings. Being able to create this spatiotemporally specific mental map is of utmost importance for the user to feel alert and safe.