Buxton, W. (1987). Masters and slaves versus democracy: MIDI and local area networks, Proceedings of the 5th International Conference on Music and Digital Technology, Audio Engineering Society, Anaheim, CA, May 1-3, 207-219.
Masters and Slaves Versus Democracy:
MIDI and Local Area Networks
Computer Systems Research Institute
University of Toronto
Canada M5S 1A4
Abstract
Some problems that exist with MIDI are discussed. It is argued that these problems can be rectified without losing the large investment already made in the specification. The approach proposed is to add a layer of communications above MIDI. The proposed layer uses Local Area Network (LAN) technology borrowed from the computer industry. LAN technologies are briefly introduced. An architecture that combines MIDI and LANs is described. The intent is to demonstrate a viable solution to MIDI's main problems which does not involve redefining the specification.
Not surprisingly, however, MIDI is not perfect. It has limitations. Do these shortcomings mean that the MIDI specification needs to be revised? Is the existing base of MIDI gear threatened with obsolescence?
In this presentation, we will investigate some of the problems with
MIDI from the perspective of both the user and the technologist.
Most importantly, we will discuss how an existing technology (local area
networks, or LANs) can be used to address the problems without having to
discard the existing investment in MIDI. Rather than hurl stones
and be MIDI bashers, we want to show how we can get even more out of an
already good thing.
Looking at bandwidth, we see that problems generally occur when a computer or sequencer is controlling a large number of MIDI devices. If the music is complex, there may be intensive bursts where the amount of data to be communicated to the slave devices saturates the channel capacity of MIDI. This problem has already been addressed by some devices, without abandoning MIDI, by the controlling device having more than one MIDI OUT port. This way, the bandwidth is distributed over a number of different physical channels. This is the approach taken with the Yamaha QX-1 sequencer, for example.
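As a rough illustration of how a burst can saturate a single MIDI cable, the following sketch estimates the time needed to transmit a chord. It assumes the standard MIDI 1.0 figures (31,250 bits per second, 10 bits on the wire per byte, 3 bytes per Note On with no running status). The function names are hypothetical, and the even split across ports is a simplification.

```python
MIDI_BAUD = 31_250    # bits per second (MIDI 1.0 serial rate)
BITS_PER_BYTE = 10    # 8 data bits plus one start and one stop bit
NOTE_ON_BYTES = 3     # status, key number, velocity (no running status)


def message_time_ms(n_bytes: int) -> float:
    """Time in milliseconds to send n_bytes over one MIDI cable."""
    return n_bytes * BITS_PER_BYTE / MIDI_BAUD * 1000.0


def burst_time_ms(n_notes: int, n_ports: int = 1) -> float:
    """Time to send n_notes Note On messages spread evenly over n_ports."""
    notes_per_port = -(-n_notes // n_ports)  # ceiling division
    return notes_per_port * message_time_ms(NOTE_ON_BYTES)


print(round(burst_time_ms(32), 2))     # one MIDI OUT port
print(round(burst_time_ms(32, 4), 2))  # four MIDI OUT ports
```

Under these assumptions a 32-note burst occupies one cable for roughly 31 ms, while spreading it over four ports brings the figure to under 8 ms.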
Another class of problems relates to the fact that MIDI ports are unidirectional. Hence, if a device is to be able to send and receive MIDI data, it must have two MIDI cables connected to it: one incoming and one outgoing. This is illustrated in Fig. 1.
Figure 1: Connecting Two Devices by MIDI
Cables are unidirectional, so two cables are required for two-way communication. A sequencer can record performance data from a keyboard on one cable. Later, it can control the keyboard using the other.
This presents no problems when only two MIDI devices are involved. However, things change dramatically as soon as a third device is added.
Imagine controlling a synthesizer using a saxophone and a pitch-to-MIDI converter. The pitch-to-MIDI converter is the master and the synthesizer the slave. In order to record the performance, a sequencer is also hooked up as a second slave. One way to do this is shown in Fig. 2.
When the performance is finished, we want to hear the playback. This means that the sequencer now becomes master and the synthesizer its slave. But how is this done? We must revert to the 1960's and repatch the MIDI cords so that the MIDI OUT (rather than THRU) of the sequencer is connected to the MIDI IN of the synthesizer.
Figure 2: A Master and Two Slaves
The sax data is sent to the sequencer, then passed on to the synthesizer via the sequencer's MIDI THRU port.
At least in the 1960's, patch cords used 1/4" phone jacks. However, MIDI cords use DIN plugs, which have to be oriented a special way for alignment and which plug into the back of the equipment (where neither the socket nor the label can be seen!). To add insult to injury, this switching back and forth must be undertaken every time that we want to alternate between recording and playback.
Life becomes even more complicated if we want to play along with the sequencer, having the sax and the sequencer share the voices of the synthesizer. Despite the fact that the synthesizer has voices to spare, this cannot be done. With MIDI, there can be only one master device. Only the sax or the sequencer can control the synthesizer at any one time.
Admittedly, there are boxes (such as MIDI switchers and MIDI mergers)
that take care of these and related problems. However, it is our
contention that these are "kludged", or "rubber-band" solutions.
They help us make the best out of a bad situation, but they don't fix the
There are three problems with this solution. The first is price. Perhaps the best thing about MIDI is its low cost. It is simple, uses standard components, and does not add significantly to the cost of an instrument. To musicians, this is of no small concern. In contrast, the bus solution could constitute a significant portion of the price of an instrument.
The second problem is technical. Devices connected to high-speed busses tend to need to be in close physical proximity. The distributed nature of gear in most performance and studio set-ups does not lend itself to this type of tight-coupling.
The third problem with the bus solution, or any other solution which involves replacing MIDI with some other specification, is that we lose the investment already made in MIDI. This loss is not only one of money. It is also one of momentum and confidence in the industry.
LANs: Another Option
Fortunately, there is another alternative which gives us all the advantages of the bus solution without the majority of the shortcomings. Most important, this solution is fully compatible with existing MIDI.
The approach makes use of a technology which is in common use in the computer industry: local area networks, or LANs. Like MIDI, LANs are a communications technology. They have been developed to facilitate communications among equipment which is confined to a reasonably small area (such as a building). LAN technologies are distinguished by three main characteristics:
Figure 3: LAN Topologies
The four basic topologies used in interconnecting devices using a LAN: (a) star, (b) ring, (c) bus and (d) tree. (From Stallings, 1983.)
There are four basic topologies that are used in interconnecting devices with a LAN. These are star, ring, bus, and tree configurations. These are illustrated in Fig. 3.
Like MIDI, all LANs include a protocol for communication as well as a means of physical interconnection. Unlike MIDI, the protocol used is multi-access. That is, any device on the network can broadcast to any other of its legal addressees. Linked to this is a very important property: LANs employ distributed control protocols. That is, unlike MIDI, there is no central master device. Each device on the network contains the logic to assert control for the purpose of sending or receiving messages. One clear consequence of this is that device transceivers are significantly more complex than those used for MIDI, and are therefore more expensive. While we must take this cost difference into consideration, it can be minimized. Since the installed base of some of the more popular LANs is quite large, single chip VLSI transceivers are available at a relatively low cost.
Unlike MIDI, the computer industry has not standardized upon a single LAN protocol. Despite the number of implementations available, there are two basic types of protocol: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) and Token Bus.
CSMA/CD protocols are a bit like walkie-talkie protocols. When one station is sending a message (broadcasting), it has the channel until it is finished. Other nodes on the network can just listen. If they want to transmit, they must wait until the channel is clear. Once clear, however, two or more nodes may simultaneously try to "grab" the channel to send their messages. When this happens, there is a collision, and the message of each is garbled. Key to this class of protocol is that nodes have the logic to detect when collisions have occurred. Garbled messages are ignored, and the nodes trying to transmit will time out for a brief interval, then try to retransmit. A good example of a LAN that uses CSMA/CD protocol is the popular Ethernet, developed by Xerox Corp.
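The collide, time out, and retry cycle described above can be sketched as a small slotted simulation. This is a simplification under stated assumptions, not a model of Ethernet itself: every frame occupies exactly one time slot (so carrier sensing is implicit), and the random timeout window doubles after each collision up to a cap. The doubling policy is borrowed from common Ethernet practice and does not come from the text above. All names are hypothetical.

```python
import random


def simulate_csma_cd(frames, seed=0, max_slots=10_000):
    """Slotted CSMA/CD sketch.

    frames maps a node name to the number of frames it wants to send.
    Returns (delivered, collisions): the senders in order of success,
    and the number of slots lost to collisions.
    """
    rng = random.Random(seed)
    remaining = dict(frames)
    wait = {node: 0 for node in frames}      # slots left before a retry
    failures = {node: 0 for node in frames}  # collisions since last success
    delivered, collisions = [], 0

    for _ in range(max_slots):
        if not any(remaining.values()):
            break
        # Nodes with data and no pending timeout try to grab the channel.
        ready = [n for n in frames if remaining[n] > 0 and wait[n] == 0]
        for n in frames:
            if wait[n] > 0:
                wait[n] -= 1
        if len(ready) == 1:
            node = ready[0]
            remaining[node] -= 1
            failures[node] = 0
            delivered.append(node)
        elif len(ready) > 1:
            # Garbled slot: every sender times out for a random interval.
            collisions += 1
            for node in ready:
                failures[node] += 1
                # Assumed policy: the window doubles per failure, capped at 32.
                wait[node] = rng.randrange(2 ** min(failures[node], 5))
    return delivered, collisions


delivered, collisions = simulate_csma_cd({"sax": 2, "sequencer": 2}, seed=1)
print(delivered, collisions)
```

Because both nodes start with data in the first slot, at least one collision always occurs in this example; the order in which the frames finally get through depends on the random timeouts.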
Token passing protocols avoid the problems of collisions encountered with the CSMA/CD protocols. They do this by having transmit permission explicitly passed from node to node. The approach is "round-robin" rather than "need" based. Nodes can be thought of as a relay team where each person runs in order, but only one person has the baton at a time. When the runner with the baton is finished, the baton is passed to the next person in the running order. Similarly, the nodes on a token bus have a transmission order. When the first in the order has finished its transmission, it passes a token to the next node in the sequence. Having received the token, this node can now transmit. This passing of the token continues from node to node, with the last node in the order passing the token back to the first. There is one way in which the relay race analogy doesn't work. In a race, each runner must run. But in a token passing protocol, a node with nothing to transmit just passes the token on to the next node in the order. Also, even while node n has transmit control, it can receive simple messages of acknowledgement from other nodes in the network. Examples of LANs that use token passing protocols are ARCNET, developed by Datapoint, and IBM's new token-ring LAN.
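The round-robin behaviour can be sketched in a few lines. The sketch assumes that a node sends at most one message each time it holds the token; real token passing LANs set their own limits on how long the token may be held. The names are hypothetical.

```python
from collections import deque


def token_bus(queues, order, max_rounds=100):
    """Round-robin token passing sketch.

    queues maps a node name to the messages it wants to send.
    order is the fixed sequence in which the token travels.
    Returns the list of (node, message) pairs in transmission order.
    """
    pending = {node: deque(queues.get(node, [])) for node in order}
    log = []
    for _ in range(max_rounds):
        if not any(pending.values()):
            break
        # One full trip of the token; the last node hands it back to the first.
        for node in order:
            if pending[node]:
                log.append((node, pending[node].popleft()))
            # A node with nothing to send simply passes the token on.
    return log


print(token_bus({"A": ["a1", "a2"], "B": [], "C": ["c1"]}, ["A", "B", "C"]))
# prints [('A', 'a1'), ('C', 'c1'), ('A', 'a2')]
```

No two nodes ever transmit at once, so nothing is garbled, but node A must wait for a full trip of the token before sending its second message even though B has nothing to say.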
The two classes of protocol have different strengths and weaknesses.
CSMA/CD is typically cheaper, since the token passing protocol requires
additional logic.  When network traffic is low, the CSMA/CD protocols
may result in faster communication.  However, the efficiency of the
protocol degrades rapidly when traffic approaches the channel capacity
of the network.  Token passing protocols are becoming increasingly
popular, largely because they have been adopted by IBM.  The cost difference
between the two protocols is being reduced due to the appearance of single
chip LAN controllers.
Figure 4: LAN With MIDI Servers
Three MIDI devices are connected to a single LAN via dedicated MIDI Servers. Any device on the LAN can communicate with any other without any new physical connections.
Fig. 4, for example, shows three MIDI devices. Each is connected to a LAN via a dedicated MIDI server. Any device on the LAN, including the three MIDI devices and the microcomputer, can communicate with any other. No new physical connections need be made. For example, MIDI device 1 could be driven by the computer, while MIDI device 2 is controlled by MIDI device 3.
Figure 5: The Anatomy of a MIDI Server
The MIDI Server consists of 3 main components: the interface to the LAN, a microprocessor, and the MIDI controller.
The heart of the MIDI server, however, is the microprocessor that sits between the two communications controllers. The basic idea is that this processor has enough ROM to initialize itself and the controllers to a reasonable state at start-up, and enable more sophisticated control software to be down-loaded from a host computer elsewhere on the network.
For example, the local microprocessor would be instructed which messages to respond to and which to ignore. In addition, it would be told to whom to transmit. As a result, this internal "soft" logic could perform all of the functions of a MIDI switcher.
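One way to picture this "soft" switching is as a pair of rule sets held by each server and rewritten by the host. The sketch below is illustrative only: the class, its methods, and the use of device names as LAN addresses are all hypothetical, and it does not describe the software of the prototype.

```python
class MidiServer:
    """Hypothetical model of one MIDI server's soft switching logic."""

    def __init__(self, name):
        self.name = name
        self.accept_from = set()  # LAN sources this server responds to
        self.send_to = set()      # LAN destinations for local MIDI input
        self.midi_out = []        # messages forwarded to the attached device

    def configure(self, accept_from=(), send_to=()):
        """Load new routing rules, as instructed by the host computer."""
        self.accept_from = set(accept_from)
        self.send_to = set(send_to)

    def on_lan_message(self, source, message):
        """Forward a LAN message to the MIDI device only if its source is accepted."""
        if source in self.accept_from:
            self.midi_out.append(message)
            return True
        return False

    def on_midi_input(self, message):
        """Address a message from the local device to each configured destination."""
        return [(dest, message) for dest in sorted(self.send_to)]


synth = MidiServer("synth")
note_on = bytes([0x90, 60, 100])

# Recording: the synthesizer follows the sax and ignores the sequencer.
synth.configure(accept_from={"sax"})
synth.on_lan_message("sax", note_on)
synth.on_lan_message("sequencer", note_on)

# Playback: the host swaps the rule; no cable is touched.
synth.configure(accept_from={"sequencer"})
synth.on_lan_message("sequencer", note_on)
```

In this picture, the switch from recording to playback that earlier required repatching becomes a single change to the rules held by the synthesizer's server.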
Similarly, the code in the server's processor could filter or modify incoming or outgoing MIDI data in a manner specified by the host computer. Such programmed functions could include merging MIDI data from more than one source.
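A minimal sketch of such programmed functions follows, assuming that each source delivers whole, time-stamped MIDI messages already sorted by time. Merging has to work on complete messages, since interleaving the individual bytes of two sources would garble both. The function names and the (time, message) event format are hypothetical; the status byte layout (message type in the upper four bits, channel in the lower four) is that of MIDI channel messages.

```python
import heapq


def merge_streams(*streams):
    """Merge time-sorted lists of (time_ms, message) events into one list."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))


def remap_channel(event, new_channel):
    """Move a channel message to new_channel (0-15); leave other messages alone."""
    time_ms, message = event
    status = message[0]
    if 0x80 <= status <= 0xEF:  # channel messages carry a channel number
        message = bytes([(status & 0xF0) | (new_channel & 0x0F)]) + message[1:]
    return (time_ms, message)


sax = [(0, bytes([0x90, 60, 64])), (10, bytes([0x80, 60, 0]))]
sequencer = [(5, bytes([0x90, 67, 80]))]

# Put the sequencer's part on the second channel, then merge with the sax.
sequencer = [remap_channel(event, 1) for event in sequencer]
merged = merge_streams(sax, sequencer)
print([time_ms for time_ms, _ in merged])  # prints [0, 5, 10]
```

With a merge of this kind in the synthesizer's server, the sax and the sequencer could share the voices of one synthesizer, which a single MIDI IN cannot offer on its own.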
The MIDI implementation of many commercial devices leaves much to be desired. Of special concern are many devices' limited ability to report their current status to another. Again, the server's microprocessor provides a useful function in that it can be informed what device it is controlling, and keep track of its current status.
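One hedged way to picture this status tracking is a "shadow" of the device that the server updates as messages pass through it. The sketch assumes that the server sees every message sent to the device; changes made on the device's own front panel would go unnoticed. The class and its fields are hypothetical, and only notes and program changes are tracked.

```python
class DeviceShadow:
    """Hypothetical record of what a server believes its device's state to be."""

    def __init__(self):
        self.program = {}      # channel -> last program number sent
        self.notes_on = set()  # (channel, key) pairs believed to be sounding

    def observe(self, message: bytes):
        """Update the shadow from one complete MIDI channel message."""
        kind, channel = message[0] & 0xF0, message[0] & 0x0F
        if kind == 0x90 and message[2] > 0:
            self.notes_on.add((channel, message[1]))
        elif kind in (0x80, 0x90):
            # Note Off, or Note On with velocity 0, which MIDI treats as Note Off.
            self.notes_on.discard((channel, message[1]))
        elif kind == 0xC0:
            self.program[channel] = message[1]


shadow = DeviceShadow()
for message in (bytes([0xC0, 5]), bytes([0x90, 60, 100]),
                bytes([0x90, 64, 90]), bytes([0x90, 60, 0])):
    shadow.observe(message)

print(shadow.program, shadow.notes_on)  # prints {0: 5} {(0, 64)}
```

Another device on the network could then ask the server, rather than the instrument itself, which program is selected or which notes are still sounding.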
In effect, the server is a buffer that provides a degree of valuable device independence for each MIDI module. An interesting and potent aspect of such a distributed system is the powerful user interface that can be constructed. Here there is a great deal of opportunity to make full use of the capabilities of the host computer, and to integrate control functions with other software components of a music system.
With the MIDI server, the conventional computer-MIDI interface is replaced. The only interface used by the host computer is that for the LAN. In the proposed architecture, the MIDI interface is distributed over the network. Situations where we are replacing single input MIDI interfaces with 2 or 4 input ones would be eliminated. The number of logical inputs and outputs in the LAN/MIDI architecture is just a function of the control software. Furthermore, the proposed configuration permits resources to be shared among a number of workstations. Since more than one computer can be connected to the LAN at a time, MIDI devices not being used by one can be employed by another. This is of particular importance in educational situations.
At first glance, the architecture described may appear to be too complex
to be cost effective. But we believe that this impression may be
short lived. To begin with, consider the number of costly peripherals
that are being eliminated. With this approach, MIDI interfaces, filters,
switchers, processors, and mergers are all replaced by a single type of
module. Furthermore, one can add the modules incrementally, as the
number of MIDI devices increases. In practice, more than one MIDI
module could be serviced by a single server. In the prototypes that
we have been building (Kokodyniak, forthcoming), each server has two MIDI
input/output pairs. This is feasible, since the capacity of the LAN
controller and microprocessor is more than twice that required by a single
The main advantage of the approach is that the shortcomings of MIDI are addressed using an existing technology, and the large investment already made in MIDI need not be lost. The primary problems that need to be addressed have to do with reducing the cost per node and avoiding timing problems that can result due to the additional layer of communication.
I would like to acknowledge the contribution of Paul Vytas who is responsible for generating many of the ideas in this paper. I would also like to thank John Kitamura, David Blythe, Mike Kokodyniak and Martin Snelgrove for helpful comments and appreciated stimulation. Peter Desain also made several helpful comments on the manuscript.
The work presented in this paper has been supported in part by the Natural
Sciences and Engineering Research Council of Canada. In addition,
Yamaha Corp. has assisted in donating samples of their YM3802 MIDI controller
chip for our prototype. The assistance of both is gratefully acknowledged.
Kokodyniak, M. (forthcoming). Interconnection of MIDI Devices Through a LAN, M.A.Sc. Thesis, Department of Electrical Engineering, University of Toronto, in progress.
Loy, G. (1985). Musicians Make a Standard: The MIDI Phenomenon, Computer Music Journal, 9(4), 8-26.
Stallings, W. (Ed.)(1983). Tutorial: Local Area Network Technology, IEEE Computer Society, P.O. Box 80452, Worldway Postal Center, Los Angeles, CA 90080.
Stallings, W. (1987). Local Networks: An Introduction, (Second Edition),
New York: Macmillan Publishing Co.
In 1975 he became leader of a project that developed one of the early fully digital portable systems for composition and performance. This system, developed at the University of Toronto, has become well known for its use of graphics and special hardware as a means of making the instrument usable by musicians.
Buxton is the past president of the Computer Music Association and chairman of the Editorial Board of the Computer Music Journal. He is a Research Scientist at the Computer Systems Research Institute of the University of Toronto where he co-directs the computer graphics laboratory and specializes in research into human-computer interaction. He also works as a consultant for a number of technology related firms, including IVL Technologies of Victoria, B.C.
Buxton is on the advisory board of ACM SIGCHI and the editorial board
of the journal Human Computer Interaction. Current projects include
the design of new compositional and performance software for a number of
the new generation of powerful microcomputers and writing and performing
with his own ensemble of musicians.