
Buxton, W. (1987). Masters and slaves versus democracy: MIDI and local area networks, Proceedings of the 5th International Conference on Music and Digital Technology, Audio Engineering Society, Anaheim, CA, May 1-3, 207-219.

Masters and Slaves Versus Democracy:

MIDI and Local Area Networks

Bill Buxton
Computer Systems Research Institute
University of Toronto
Toronto, Ontario
Canada M5S 1A4
buxton@aw.sgi.com

Abstract
Some problems that exist with MIDI are discussed. It is argued that these problems can be rectified without losing the large investment already made in the specification. The approach proposed is to add a layer of communications above MIDI. The proposed layer uses Local Area Network (LAN) technology borrowed from the computer industry. LAN technologies are briefly introduced. An architecture that combines MIDI and LANs is described. The intent is to demonstrate a viable solution to MIDI's main problems which does not involve redefining the specification.

Introduction

MIDI (IMA, 1983; Loy, 1985) ushered in a fundamentally new era of electronic music. Despite a highly competitive market, manufacturers agreed to cooperate and establish a standard lingua franca, or protocol, that would permit their products to communicate and share control information. The payback for all concerned, especially for the musician, has been outstanding. MIDI has opened up a whole new range of possibilities in electronic music and studio production.

Not surprisingly, however, MIDI is not perfect. It has limitations. Do these shortcomings mean that the MIDI specification needs to be revised? Is the existing base of MIDI gear threatened with obsolescence?

In this presentation, we will investigate some of the problems with MIDI from the perspective of both the user and the technologist. Most importantly, we will discuss how an existing technology (local area networks, or LANs) can be used to address the problems without having to discard the existing investment in MIDI. Rather than hurl stones and be MIDI bashers, we want to show how we can get even more out of an already good thing.

The Problems With MIDI Are ...

There are two main problems that one hears about MIDI: that it is too slow, and that it doesn't have the bandwidth to handle large complex set-ups. But do these problems really mean that MIDI is inadequate, or that it needs to be redefined? Looking at speed, we see that in virtually all cases the problem is due to a poor implementation in a particular device (not the problem of the specification), or that it is the result of propagation delay, due to several devices being daisy-chained together (using the MIDI THRU feature).

Looking at bandwidth, we see that problems generally occur when a computer or sequencer is controlling a large number of MIDI devices. If the music is complex, there may be intensive bursts where the amount of data to be communicated to the slave devices saturates the channel capacity of MIDI. This problem has already been addressed by some devices, without abandoning MIDI, by the controlling device having more than one MIDI OUT port. This way, the bandwidth is distributed over a number of different physical channels. This is the approach taken with the Yamaha QX-1 sequencer, for example.
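The saturation argument can be made concrete with a little arithmetic. The sketch below assumes the MIDI 1.0 serial rate of 31.25 kbaud, 10 bits on the wire per byte (start bit, 8 data bits, stop bit), and three-byte note-on messages sent without running status. These figures come from the MIDI specification rather than from this paper, and the function names are purely illustrative.

```python
MIDI_BAUD = 31_250          # bits per second on one MIDI cable (MIDI 1.0)
BITS_PER_BYTE_ON_WIRE = 10  # start bit + 8 data bits + stop bit
NOTE_ON_BYTES = 3           # status, key number, velocity (no running status)


def message_time_ms(n_bytes: int) -> float:
    """Time in milliseconds to transmit n_bytes over one MIDI cable."""
    return n_bytes * BITS_PER_BYTE_ON_WIRE / MIDI_BAUD * 1000.0


def burst_time_ms(n_notes: int, n_ports: int = 1) -> float:
    """Time to send n_notes note-ons spread evenly over n_ports MIDI OUTs."""
    notes_on_busiest_port = -(-n_notes // n_ports)  # ceiling division
    return notes_on_busiest_port * message_time_ms(NOTE_ON_BYTES)


if __name__ == "__main__":
    print(f"one note-on:            {message_time_ms(NOTE_ON_BYTES):.2f} ms")
    print(f"32-note burst, 1 port:  {burst_time_ms(32, 1):.2f} ms")
    print(f"32-note burst, 4 ports: {burst_time_ms(32, 4):.2f} ms")
```

Under these assumptions a single note-on occupies the cable for 0.96 ms, so a burst of 32 simultaneous notes takes about 30.72 ms on one port but only 7.68 ms when spread over four ports, which is the effect of the multiple MIDI OUT approach described above.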

Another class of problems relates to the fact that MIDI ports are unidirectional. Hence, if a device is to be able to send and receive MIDI data, it must have two MIDI cables connected to it: one incoming and one outgoing. This is illustrated in Fig. 1.

Figure 1: Connecting Two Devices by MIDI. Cables are unidirectional, so two cables are required for two-way communication. A sequencer can record performance data from a keyboard on one cable. Later, it can control the keyboard using the other.

This presents no problems when only two MIDI devices are involved. However, things change dramatically as soon as a third device is added.

Imagine controlling a synthesizer using a saxophone and a pitch-to-MIDI converter. The pitch-to-MIDI converter is the master and the synthesizer the slave. In order to record the performance, a sequencer is also hooked up as a second slave. One way to do this is shown in Fig. 2.

When the performance is finished, we want to hear the playback. This means that the sequencer now becomes master and the synthesizer its slave. But how is this done? We must revert to the 1960's and repatch the MIDI cords so that the MIDI OUT (rather than THRU) of the sequencer is connected to the MIDI IN of the synthesizer.

Figure 2: A Master and Two Slaves. The sax data is sent to the sequencer, then passed on to the synthesizer via the sequencer's MIDI THRU port.

At least in the 1960's, patch cords used 1/4" phone jacks. MIDI cords, however, use DIN plugs, which have to be oriented a special way for alignment and which plug into the back of the equipment (where neither the socket nor the label can be seen!). To add insult to injury, this switching back and forth must be undertaken every time that we want to alternate between recording and playback.

Life becomes even more complicated if we want to play along with the sequencer, having the sax and the sequencer share the voices of the synthesizer. Despite the fact that the synthesizer has voices to spare, this cannot be done. With MIDI, there can be only one master device. Only the sax or the sequencer can control the synthesizer at any one time.

Admittedly, there are boxes (such as MIDI switchers and MIDI mergers) that take care of these and related problems. However, it is our contention that these are "kludged", or "rubber-band" solutions. They help us make the best out of a bad situation, but they don't fix the basic problem.

One Solution: Replace MIDI With a Bus

One way to solve virtually all of the problems listed above is to replace MIDI with a high-speed, high-capacity bus. Speed and bandwidth problems would be eliminated. Since a bus is bidirectional, messages would be available to all connected devices. Therefore, physical switching and patching problems would also disappear. Finally, by building some intelligence into device transceivers, tasks such as filtering and merging MIDI-like data could be performed without additional peripherals.

There are three problems with this solution. The first is price. Perhaps the best thing about MIDI is its low cost. It is simple, uses standard components, and does not add significantly to the cost of an instrument. To musicians, this is of no small concern. In contrast, the bus solution could constitute a significant portion of the price of an instrument.

The second problem is technical. Devices connected to high-speed busses tend to need to be in close physical proximity. The distributed nature of gear in most performance and studio set-ups does not lend itself to this type of tight-coupling.

The third problem with the bus solution, or any other solution which involves replacing MIDI with some other specification, is that we lose the investment already made in MIDI. This loss is not only one of money. It is also one of momentum and confidence in the industry.

LANs: Another Option
Fortunately, there is another alternative which gives us all the advantages of the bus solution without the majority of the shortcomings. Most important, this solution is fully compatible with existing MIDI.

The approach makes use of a technology which is in common use in the computer industry: local area networks, or LANs. Like MIDI, LANs are a communications technology. They have been developed to facilitate communications among equipment which is confined to a reasonably small area (such as a building). LAN technologies are distinguished by three main characteristics:

transmission medium
network topology, and
communications protocol

The medium of transmission is typically twisted pair, coaxial, or fibre optic cable. The upper capacity of each is about 10, 50 and 250 megabits/sec, respectively. However, as bandwidth increases, so does cost per connection.

Figure 3: LAN Topologies. The four basic topologies used in interconnecting devices using a LAN: (a) star, (b) ring, (c) bus and (d) tree. (From Stallings, 1983.)

There are four basic topologies that are used in interconnecting devices with a LAN. These are star, ring, bus, and tree configurations. These are illustrated in Fig. 3.

Like MIDI, all LANs include a protocol for communication as well as a means of physical interconnection. Unlike MIDI, the protocol used is multi-access. That is, any device on the network can broadcast to any other of its legal addressees. Linked to this is a very important property: LANs employ distributed control protocols. That is, unlike MIDI, there is no central master device. Each device on the network contains the logic to assert control for the purpose of sending or receiving messages. One clear consequence of this is that device transceivers are significantly more complex than those used for MIDI, and are therefore more expensive. While we must take this cost difference into consideration, it can be minimized. Since the installed base of some of the more popular LANs is quite large, single chip VLSI transceivers are available at a relatively low cost.

Unlike MIDI, the computer industry has not standardized upon a single LAN protocol. Despite the number of implementations available, there are two basic types of protocol: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) and Token Bus.

CSMA/CD protocols are a bit like walkie-talkie protocols. When one station is sending a message (broadcasting), it has the channel until it is finished. Other nodes on the network can just listen. If they want to transmit, they must wait until the channel is clear. Once clear, however, two or more nodes may simultaneously try to "grab" the channel to send their messages. When this happens, there is a collision, and the message of each is garbled. Key to this class of protocol is that nodes have the logic to detect when collisions have occurred. Garbled messages are ignored, and the nodes trying to transmit will time out for a brief interval, then try to retransmit. A good example of a LAN that uses CSMA/CD protocol is the popular Ethernet, developed by Xerox Corp.
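The listen, collide, back off and retry cycle can be sketched as a small slotted simulation. This is an illustration under stated assumptions, not a model of any particular product: every frame occupies exactly one time slot, and the retry interval is drawn using truncated binary exponential back-off (the scheme Ethernet uses), which the text above does not specify. The function name `simulate_csma_cd` and its parameters are hypothetical.

```python
import random


def simulate_csma_cd(pending, seed=0, max_slots=10_000):
    """Slotted sketch of CSMA/CD; returns (delivery_log, collision_count).

    `pending` maps a station name to the number of one-slot frames it
    wants to send. The delivery log is a list of (slot, station) pairs.
    """
    rng = random.Random(seed)
    pending = dict(pending)
    wait = {s: 0 for s in pending}      # remaining back-off slots
    attempts = {s: 0 for s in pending}  # collisions on the current frame
    log, collisions = [], 0

    for slot in range(max_slots):
        if not any(pending.values()):
            break
        # Stations with a frame and no back-off pending sense a clear channel.
        ready = [s for s in pending if pending[s] > 0 and wait[s] == 0]
        for s in pending:
            if wait[s] > 0:
                wait[s] -= 1
        if len(ready) == 1:
            # Exactly one transmitter: the frame gets through.
            station = ready[0]
            pending[station] -= 1
            attempts[station] = 0
            log.append((slot, station))
        elif len(ready) > 1:
            # Collision: all frames are garbled; each sender backs off
            # for a random number of slots before trying again.
            collisions += 1
            for s in ready:
                attempts[s] += 1
                window = 2 ** min(attempts[s], 10)
                wait[s] = rng.randrange(window)
    return log, collisions


if __name__ == "__main__":
    log, collisions = simulate_csma_cd({"sax": 2, "sequencer": 2, "synth": 1})
    print(f"{len(log)} frames delivered after {collisions} collision(s)")
```

A lone station never collides, while several stations that all become ready in the same slot are guaranteed at least one collision before the random back-off separates them.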

Token passing protocols avoid the problems of collisions encountered with the CSMA/CD protocols. They do this by having transmit permission explicitly passed from node to node. The approach is "round-robin" rather than "need" based. Nodes can be thought of as a relay team where each person runs in order, but only one person has the baton at a time. When the runner with the baton is finished, the baton is passed to the next person in the running order. Similarly, the nodes on a token bus have a transmission order. When the first in the order has finished its transmission, it passes a token to the next node in the sequence. Having received the token, this node can now transmit. This passing of the token continues from node to node, with the last node in the order passing the token back to the first. There is one way that the relay race analogy doesn't work. In a race, each runner must run. But in a token passing protocol, if you have nothing to transmit, you just pass the token on to the next node in the order. Also, despite node n having transmit control, it can receive simple messages of acknowledgement from other nodes in the network. Examples of LANs that use token passing protocols are ARCNET, developed by Datapoint, and IBM's new token-ring LAN.
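The round-robin behaviour just described can be sketched in a few lines. The sketch assumes that a node holding the token sends at most one frame before passing it on; real token passing LANs bound the holding time in other ways, and the name `token_bus` is purely illustrative.

```python
from collections import deque


def token_bus(order, queues, max_rotations=100):
    """Round-robin token passing; returns the list of (node, frame) sent.

    `order` is the fixed transmission order. `queues` maps a node to the
    frames it is waiting to send (nodes may be absent if they have none).
    """
    queues = {node: deque(queues.get(node, [])) for node in order}
    sent = []
    for _ in range(max_rotations):
        if not any(queues.values()):
            break                      # nothing left to send anywhere
        for node in order:             # the token travels in a fixed order
            if queues[node]:           # the holder transmits one frame...
                sent.append((node, queues[node].popleft()))
            # ...otherwise it simply passes the token to the next node
    return sent


if __name__ == "__main__":
    traffic = {"sax": ["a1", "a2"], "synth": ["c1"]}
    for node, frame in token_bus(["sax", "sequencer", "synth"], traffic):
        print(node, frame)
```

Because only the token holder transmits, there are no collisions to detect, and a node with nothing to say (the sequencer here) costs only the time needed to pass the token along.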

The two classes of protocol have different strengths and weaknesses. CSMA/CD is typically cheaper, since the token passing protocol requires additional logic. When network traffic is low, the CSMA/CD protocols may result in faster communication. However, the efficiency of CSMA/CD degrades rapidly when traffic approaches the channel capacity of the network. Token passing protocols are becoming increasingly popular, largely because they have been adopted by IBM. The cost difference between the two protocols is being reduced due to the appearance of single chip LAN controllers.

The Notion of a "Server"

Permitting personal workstations to communicate is one of the reasons that LANs are becoming increasingly popular. Perhaps even more important is the fact that LANs permit workstations to share expensive resources. With a LAN, each person can have a private workstation on which they do their personal work, but still have access to public resources which are shared with the general community. These public resources are typically attached to dedicated processors which are on the network. These dedicated utilities are generally called servers. For example, a typical network would have one or more of the following:

a print server to provide hard-copy output;
a file server to provide a central repository for files used by the community as a whole;
a mail server to handle electronic mail coming and going to the outside world.

One of the main concepts which we want to introduce in this paper, and advocate as a possible solution to the problems of MIDI, is that of a MIDI server. Our notion of a MIDI server is a node on a LAN which serves one or more MIDI devices.

Figure 4: LAN With MIDI Servers. Three MIDI devices are connected to a single LAN via dedicated MIDI Servers. Any device on the LAN can communicate with any other without any new physical connections.

Fig. 4, for example, shows three MIDI devices. Each is connected to a LAN via a dedicated MIDI server. Any device on the LAN, including the three MIDI devices and the microcomputer, can communicate with any other. No new physical connections need be made. For example, MIDI device 1 could be driven by the computer, while MIDI device 2 is controlled by MIDI device 3.
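One way to picture how MIDI data might travel over the LAN is as ordinary MIDI bytes wrapped in an addressed frame. The three-byte header used below (destination, source, payload length) is a hypothetical format invented for this illustration; the paper does not define a frame layout, and a real LAN controller would supply its own addressing and framing.

```python
import struct

# Hypothetical header: destination address, source address, payload length.
HEADER = struct.Struct("!BBB")


def wrap(dest: int, src: int, midi: bytes) -> bytes:
    """Wrap raw MIDI bytes in an addressed frame for the LAN."""
    if len(midi) > 255:
        raise ValueError("payload too long for a one-byte length field")
    return HEADER.pack(dest, src, len(midi)) + midi


def unwrap(frame: bytes):
    """Return (dest, src, midi_bytes) from a frame built by wrap()."""
    dest, src, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return dest, src, payload


if __name__ == "__main__":
    # Device 3 plays middle C (note-on, channel 1, velocity 100) on device 2.
    note_on = bytes([0x90, 60, 100])
    frame = wrap(dest=2, src=3, midi=note_on)
    print(frame.hex(), unwrap(frame))
```

The MIDI bytes themselves are untouched, which is the sense in which the LAN sits as a layer above MIDI rather than replacing it.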

The MIDI Server

The MIDI servers shown in Fig. 4 have three main logical sections. These are illustrated in Fig. 5. The first is the logic which permits the server to communicate with the LAN. At the other end, there is comparable logic which permits the server to talk to the world of MIDI. Each can be a single chip controller.

Figure 5: The Anatomy of a MIDI Server. The MIDI Server consists of 3 main components: the interface to the LAN, a microprocessor, and the MIDI controller.

The heart of the MIDI server, however, is the microprocessor that sits between the two communications controllers. The basic idea is that this processor has enough ROM to initialize itself and the controllers to a reasonable state at start-up, and enable more sophisticated control software to be down-loaded from a host computer elsewhere on the network.

For example, the local microprocessor would be instructed which messages to respond to and which to ignore. In addition, it would be told to whom to transmit. As a result, this internal "soft" logic could perform all of the functions of a MIDI switcher.
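The "soft" switching idea can be illustrated with a small routing table held by each server. The class and method names below (`SoftSwitch`, `configure`, `from_lan`, `from_midi_in`) are hypothetical, as are the integer node addresses; the sketch shows only the principle that the host changes routing by sending new settings rather than by moving cables.

```python
class SoftSwitch:
    """Routing state that a host might download into one MIDI server."""

    def __init__(self, address):
        self.address = address
        self.listen_to = set()  # LAN sources whose data reaches our MIDI OUT
        self.send_to = set()    # LAN destinations for data from our MIDI IN

    def configure(self, listen_to=(), send_to=()):
        """Replace the routing tables, as instructed by the host."""
        self.listen_to = set(listen_to)
        self.send_to = set(send_to)

    def from_lan(self, source, midi):
        """Return bytes for the local MIDI OUT, or None if ignored."""
        return midi if source in self.listen_to else None

    def from_midi_in(self, midi):
        """Return (destination, bytes) pairs to place on the LAN."""
        return [(dest, midi) for dest in sorted(self.send_to)]


if __name__ == "__main__":
    SAX, SEQUENCER, SYNTH = 1, 2, 3          # assumed node addresses
    synth = SoftSwitch(SYNTH)
    note = bytes([0x90, 60, 100])

    synth.configure(listen_to={SAX})         # recording: sax drives synth
    print(synth.from_lan(SAX, note) is not None)

    synth.configure(listen_to={SEQUENCER})   # playback: no repatching
    print(synth.from_lan(SAX, note) is None)
```

Switching from recording to playback in the saxophone example then amounts to one `configure` call on the synthesizer's server.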

Similarly, the code in the server's processor could filter or modify incoming or outgoing MIDI data in a manner specified by the host computer. Such programmed functions could include merging MIDI data from more than one source.
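Merging and filtering of this kind might look like the following sketch. It assumes the server has already split the incoming data into complete, time-stamped messages, each beginning with its status byte (no running status); that parsing step, the tick-based time stamps, and the function names are all assumptions made for the illustration.

```python
import heapq


def merge_streams(*streams):
    """Merge time-ordered (timestamp, message) streams into one list."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))


def drop_channel(stream, channel):
    """Remove channel messages addressed to `channel` (0-15)."""
    kept = []
    for time, msg in stream:
        status = msg[0]
        is_channel_msg = 0x80 <= status <= 0xEF
        if is_channel_msg and (status & 0x0F) == channel:
            continue                  # filtered out
        kept.append((time, msg))      # other channels and system messages
    return kept


if __name__ == "__main__":
    # Live sax on MIDI channel 1, sequencer playback on MIDI channel 2.
    sax = [(0, bytes([0x90, 60, 100])), (10, bytes([0x80, 60, 0]))]
    sequencer = [(5, bytes([0x91, 64, 90]))]
    merged = merge_streams(sax, sequencer)
    for time, msg in merged:
        print(time, msg.hex())
```

With a merge of this sort running in the synthesizer's server, the saxophone and the sequencer could share the synthesizer's voices, which plain MIDI cabling does not allow without an extra merger box.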

The MIDI implementation of many commercial devices leaves much to be desired. Of special concern are many devices' limited ability to report their current status to another. Again, the server's microprocessor provides a useful function in that it can be informed what device it is controlling, and keep track of its current status.
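Status tracking could be as simple as the server keeping a "shadow" copy of what it has sent to its device. The sketch below follows standard MIDI status bytes (0x80 note-off, 0x90 note-on, 0xB0 control change, 0xC0 program change) and the convention that a note-on with velocity 0 acts as a note-off. The class name `DeviceShadow`, its fields, and the assumption that each message arrives whole with its status byte are illustrative only.

```python
class DeviceShadow:
    """Server-side record of the state last sent to one MIDI device."""

    def __init__(self, device_name):
        self.device_name = device_name
        self.program = {}        # channel -> last program number
        self.controllers = {}    # (channel, controller) -> last value
        self.held_notes = set()  # (channel, key) currently sounding

    def observe(self, msg: bytes):
        """Update the shadow state from one complete MIDI message."""
        kind, channel = msg[0] & 0xF0, msg[0] & 0x0F
        if kind == 0xC0:                      # program change
            self.program[channel] = msg[1]
        elif kind == 0xB0:                    # control change
            self.controllers[(channel, msg[1])] = msg[2]
        elif kind == 0x90 and msg[2] > 0:     # note-on
            self.held_notes.add((channel, msg[1]))
        elif kind in (0x80, 0x90):            # note-off, or note-on vel. 0
            self.held_notes.discard((channel, msg[1]))

    def report(self):
        """Answer a status query on behalf of the device."""
        return {
            "device": self.device_name,
            "program": dict(self.program),
            "controllers": dict(self.controllers),
            "held_notes": sorted(self.held_notes),
        }


if __name__ == "__main__":
    shadow = DeviceShadow("synth")
    for message in (bytes([0xC0, 5]), bytes([0x90, 60, 64])):
        shadow.observe(message)
    print(shadow.report())
```

Any node on the LAN could then ask the server for `report()` even when the device itself has no means of describing its own state.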

In effect, the server is a buffer that provides a degree of valuable device independence for each MIDI module. An interesting and potent aspect of such a distributed system is the powerful user interface that can be constructed. Here there is a great deal of opportunity to make full use of the capabilities of the host computer, and to integrate control functions with other software components of a music system.

With the MIDI server, the conventional computer-MIDI interface is replaced. The only interface used by the host computer is that for the LAN. In the proposed architecture, the MIDI interface is distributed over the network. Situations where we are replacing single input MIDI interfaces with 2 or 4 input ones would be eliminated. The number of logical inputs and outputs in the LAN/MIDI architecture is just a function of the control software. Furthermore, the proposed configuration permits resources to be shared among a number of workstations. Since more than one computer can be connected to the LAN at a time, MIDI devices not being used by one can be employed by another. This is of particular importance in educational situations.

At first glance, the architecture described may appear to be too complex to be cost effective. But we believe that this impression may be short lived. To begin with, consider the number of costly peripherals that are being eliminated. With this approach, MIDI interfaces, filters, switchers, processors, and mergers are all replaced by a single type of module. Furthermore, one can add the modules incrementally, as the number of MIDI devices increases. In practice, more than one MIDI module could be serviced by a single server. In the prototypes that we have been building (Kokodyniak, forthcoming), each server has two MIDI input/output pairs. This is feasible, since the capacity of the LAN controller and microprocessor is more than twice that required by a single MIDI device.

Conclusions

A means of addressing some of the existing shortcomings of the MIDI specification has been introduced. The approach introduces a two-layered communications protocol into music systems. First, a local area network is used to interconnect a microcomputer and a number of MIDI servers. The MIDI servers sit between the LAN and MIDI modules, translating between the protocols used by the LAN and by MIDI. Due to the LAN, any MIDI module can communicate with any other. In addition, the microprocessor in the MIDI server permits MIDI data to be locally processed. As a result, functions such as MIDI switching, filtering, and merging can be effected in software, without any additional peripherals.

The main advantage of the approach is that the shortcomings of MIDI are addressed using an existing technology, and the large investment already made in MIDI need not be lost. The primary problems that remain to be addressed are reducing the cost per node and avoiding the timing problems that can result from the additional layer of communication.

Acknowledgements
I would like to acknowledge the contribution of Paul Vytas who is responsible for generating many of the ideas in this paper. I would also like to thank John Kitamura, David Blythe, Mike Kokodyniak and Martin Snelgrove for helpful comments and appreciated stimulation. Peter Desain also made several helpful comments on the manuscript.

The work presented in this paper has been supported in part by the Natural Sciences and Engineering Research Council of Canada. In addition, Yamaha Corp. has assisted in donating samples of their YM3802 MIDI controller chip for our prototype. The assistance of both is gratefully acknowledged.

Bibliography/References

IMA (1983). MIDI Musical Instrument Digital Interface Specification 1.0, North Hollywood: International MIDI Association.

Kokodyniak, M. (forthcoming). Interconnection of MIDI Devices Through a LAN, M.A.Sc. Thesis, Department of Electrical Engineering, University of Toronto, in progress.

Loy, G. (1985). Musicians Make a Standard: The MIDI Phenomenon, Computer Music Journal, 9(4), 8-26.

Stallings, W. (Ed.)(1983). Tutorial: Local Area Network Technology, IEEE Computer Society, P.O. Box 80452, Worldway Postal Center, Los Angeles, CA 90080.

Stallings, W. (1987). Local Networks: An Introduction, (Second Edition), New York: Macmillan Publishing Co.

About the Author

Bill Buxton is a Canadian musician who has a long background in electronic and computer music. He is a composer and performer whose works have been performed throughout Europe and North America. He is also a prolific writer, whose articles on computer music have been published in three languages. He has worked and taught at the University of Toronto, the Institute of Sonology in Utrecht Holland, EMS Stockholm, and IRCAM, Paris.

In 1975 he became leader of a project that developed one of the early fully digital portable systems for composition and performance. This system, developed at the University of Toronto, has become well known for its use of graphics and special hardware as a means of making the instrument usable by musicians.

Buxton is the past president of the Computer Music Association and chairman of the Editorial Board of the Computer Music Journal. He is a Research Scientist at the Computer Systems Research Institute of the University of Toronto where he co-directs the computer graphics laboratory and specializes in research into human-computer interaction. He also works as a consultant for a number of technology related firms, including IVL Technologies of Victoria, B.C.

Buxton is on the advisory board of ACM SIGCHI and the editorial board of the journal Human Computer Interaction. Current projects include the design of new compositional and performance software for a number of the new generation of powerful microcomputers and writing and performing with his own ensemble of musicians.