A Reconfigurable Tri-Band Interconnect for Future Network-On-Chip

Alexander Todd Dilello

Recommended Citation
https://researchrepository.wvu.edu/etd/5496

A Reconfigurable Tri-Band Interconnect for Future Network-On-Chip

by

Alexander Todd Dilello

Thesis submitted to the Benjamin M. Statler College of Engineering and Mineral Resources at West Virginia University in partial fulfillment of the requirements for the degree of

Master of Science in Electrical Engineering

Gyungsu Byun, Ph.D., Chair Afzel Noore, Ph.D. David Graham, Ph.D.

Lane Department of Computer Science and Electrical Engineering

Morgantown, West Virginia 2014

Keywords: RF-interconnect, VLSI, integrated circuits, memory I/O interface, low-power

Copyright 2014 Alexander Todd Dilello
APPROVAL OF THE EXAMINING COMMITTEE

________________________________________
Afzel Noore, Ph.D.

________________________________________
David Graham, Ph.D.

________________________________________
Gyungsu Byun, Ph.D., Chair

Date
Abstract

A Reconfigurable Tri-Band Interconnect for Future Network-On-Chip

by

Alexander Todd Dilello
Master of Science in Electrical Engineering
West Virginia University
Gyungsu Byun, Ph.D., Chair

The scaling of CMOS feature sizes has made it possible to integrate heterogeneous intellectual properties (IPs) such as graphics processing units (GPUs), digital signal processors (DSPs) and central processing units (CPUs) on a single die. Collecting multiple IPs on a single die makes reliable communication difficult because of congestion. The infrastructure that facilitates and manages communication among IPs is referred to as a network-on-chip (NoC). Its ultimate goal should be low latency with negligible power and area consumption. Unfortunately, shrinking CMOS feature sizes have exacerbated latency and signal degradation because on-chip channel resistance increases. Furthermore, contemporary interfaces use baseband-only signaling and have critical limitations such as exponential energy consumption, limited bandwidth and non-reconfigurable data access.

In this work, we propose an energy-efficient tri-band (baseband + 2 RF bands) signaling interface that is capable of simultaneous bi-directional communication and reconfigurable data access. Additionally, communication is accomplished through a shared transmission line, which reduces the overall number of global interconnections. This in turn reduces area consumption and mitigates interconnection complexity. The primary significance of this interconnect configuration compared to contemporary designs is an increase in bandwidth and energy efficiency.

The interconnect design is composed of a baseband transceiver and two RF (10 GHz and 20 GHz) transceivers. The RF transceivers use an amplitude-shift keying (ASK) modulation scheme. ASK modulation eases circuit design but, most importantly, it can be used for noncoherent communication, which we implemented in this system. Noncoherent ASK modulation is area-conservative and power-efficient since it removes the need for power-hungry frequency synthesizers. Moreover, noncoherent ASK demodulation accomplishes direct down-conversion through a passive self-mixer for additional power savings.

The results from our work show that a multi-band interconnect is a suitable remedy for future NoC communication that has been reaching its bandwidth limitation with baseband-only signaling. In conclusion, this work demonstrates a sustainable balance of energy efficiency and increased bandwidth for future on-chip interconnect designs.
Acknowledgments

Thanks to family and friends for offering patience and distractions! Thank you, Dr. Byun, for taking me on as a graduate student and for this learning experience. Lastly, I love you, Jessie!
Contents

Approval Page

Acknowledgments

List of Figures

List of Tables

1 Introduction
   1.1 Overview of this work
      1.1.1 Baseband and RF transmitter
      1.1.2 Transformer and Transmission Line
      1.1.3 Baseband and RF receiver
      1.1.4 Conclusion

2 RF Transmission Line Communication
   2.1 Communication System
   2.2 Baseband Communication
   2.3 RF Interconnect Communication
      2.3.1 Past RFI Works
   2.4 ASK Modulation

3 Baseband and RF transmitter
   3.1 Introduction
   3.2 Baseband Transmitter
   3.3 RF Transmitter
      3.3.1 VCO
      3.3.2 Ring-VCO
      3.3.3 LC-VCO
      3.3.4 ASK modulator
      3.3.5 Final layout and post-extract simulation

4 Transmission Line and Transformer
   4.1 Introduction
   4.2 Transmission Line
   4.3 Transformer

5 Baseband and RF Receivers
   5.1 Introduction
   5.2 Baseband Receiver
   5.3 RF Receiver

6 Final Results and Conclusion
   6.1 Final Simulations and Tests
      6.1.1 Final Test
   6.2 Ending Remarks, Critiques and Future Plans

References
List of Figures

2.1 General block diagram of a communication system
2.2 Channel Spectrum of baseband signal and its channel. The bandwidth of the baseband signal is within the channel’s capacity.
2.3 Channel Spectrum of baseband signal and its channel. In this case, the baseband signal’s bandwidth is larger than the channel’s capacity.
2.4 ISI effects on baseband signals in a transient simulation (a). Pulse shaping to reduce the effects of ISI (b) [1]
2.5 Ideal eye diagram of PAM-4 with sampling points [2]
2.6 RF interconnect spectrum
2.7 RF Interconnect block diagram of Chang et al.’s single band on-chip RF communication system [3]
2.8 On the left are the two input data streams with their recovered data streams. The right diagram displays the measured eye diagrams of both data streams [4].
2.9 Overall TBI topology [5]
2.10 The recovered data and eye diagrams of the tri-band interconnect. Eye diagram measurements of each signal from left to right: 50GHz, 30GHz, baseband [5]
2.11 The recovered data and eye diagrams of the tri-band interconnect. Eye diagram measurements of each signal from left to right: 50GHz, 30GHz, baseband.
3.1 General block diagram of proposed RF interconnect
3.2 Block Diagram of baseband transmitter
3.3
3.4
3.5 Schematic and open-loop frequency response of input data buffer.
3.6 Output driver schematic
3.7 Transient simulation with the positive complement input signal on top and the output signal to the channel on the bottom.
3.8 The layout of the baseband transmitter
3.9 The general RF transmitter architecture consists of a VCO, input buffer and an ASK modulator
3.10 Block diagram of negative feedback system
3.11 Ideal and non-ideal LC-tank
3.12 The LC-VCO schematic that models the resistances present in the inductor and capacitor.
3.13 The LC-VCO schematic with ideal LC tank.
3.14 The LC-VCO schematic with non-ideal LC tank, where $R_p$ represents the parasitic resistance from the inductor and capacitor.
3.15 Magnitude and phase of the cross-coupled circuit gain for the individual stage (left figure) and open-loop for the whole circuit (right) [6].
3.16 Half-circuit small-signal equivalent model of LC-VCO.
3.17 The layout of both 10 GHz (a) and 20 GHz (b) inductor for the VCOs that will produce our carrier signal for the RF transmitters. The 10 GHz inductor has a diameter of 150 $\mu$m and the 20 GHz inductor has a diameter of 150 $\mu$m.
3.18 The measured quality factor of 10 GHz inductor shown in (a) and 20 GHz inductor as seen in (b).
3.19 The ASK modulator FET configuration.
3.20 The 20 GHz VCO (Black) and ASK Modulator (Blue).
3.21 10GHz Transmitter layout with final dimensions.
3.22 20GHz Transmitter layout with final dimensions.
4.1 Transmission Line configuration within the process metal stack.
4.2 On-chip 3 mm transmission line insertion loss.
4.3 Schematic view of transmitter and receiver side transformers.
4.4 Physical top view of a transformer interfaced between channel and a RF receiver.
4.5 Physical layout of both RF band transformers.
4.6 10GHz transformer quality factor for primary and secondary coil (left). Coupling ratio for the 10GHz transformer (right).
4.7 20GHz transformer quality factor for primary and secondary coil (left). Coupling ratio for the 20GHz transformer (right).
4.8 Baseband connection to center-tapped secondary coil of the transformers. Note the secondary coil is physically connected to the transmission line.
5.1 The baseband receiver layout
5.2 The CMOS self-mixer
5.3 Spectrum of ASK-modulated signal
5.4 Schematic of common-mode voltage buffer.
5.5 The initial settling of a common-mode voltage from the buffer converter.
5.6 Layout of the buffer converter. Even for a polysilicon resistor, 700 k$\Omega$ results in quite a large size.
5.7 The mixer’s physical layout.
5.8 The receiver layout. The capacitor of the current mirror and the resistors of the buffer converter make up a lot of the height of the receiver.
6.1
6.2
6.3 Final submitted layout to foundry.
6.4 The chip and the reconfigurable tri-band interconnect micrograph image.
6.5 The PCB test board with mounted and wirebonded chip.
List of Tables

3.1 Transmitter metrics. Note that the baseband layout is shown in Figure 3.8.

4.1 Final geometrical transmission line parameters. The spacing between the M7 lines of Signal and Adjacent Grounds is 20 µm.
Chapter 1

Introduction

CMOS scaling has yielded many great leaps in computational ability while at the same time increasing energy efficiency. The computational power of today’s smartphone chips required the space of whole buildings decades ago. Furthermore, scaling has allowed multiple intellectual properties (IPs) to be interfaced with each other on the same die. These include analog-to-digital converters, graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs), memories and central processing units (CPUs). This has vastly accelerated the creation of smaller, more computationally efficient systems on a chip (SoCs).

However, as CMOS feature sizes have gone below 100 nm, analog design has become a much more involved process: transistors are more difficult to model than before because short-channel effects play a larger role in their operation. The smaller dimensions also create greater challenges for designers as wires and lines become more resistive. This increases latency and degrades signal integrity, all while consumer products demand even higher bandwidth. Moreover, the physical space devoted to wiring grows when more IPs need to communicate on a single chip. The additional wires create an extremely complex layout once cross-talk, parasitic capacitances and other signal-distorting influences are taken into account. In some cases, such as the CPU-to-memory interface, the number of pins needed to address the memory will become unsustainable if current trends continue. These problems call for a management system to coordinate communication among IP blocks. This communication infrastructure is referred to as a network-on-chip (NoC).
The NoC is largely believed to be the greatest design hurdle in implementing a modern SoC [7, 8]. A NoC must transport data efficiently while keeping power consumption low and the chip footprint small. Taking into account the aforementioned design challenges, we propose a reconfigurable tri-band (baseband + 10 GHz RF1 + 20 GHz RF2) interconnect that is capable of simultaneous bi-directional communication.¹ This approach of combining baseband and RF transmission has been used before, but primarily for off-chip applications [9–12]. It has been applied only once to on-chip communication [5].

The multi-band design allows the use of just one shared transmission line. This reduces pin count and the number of communication lines, which in turn reduces layout complexity, mitigates cross-talk concerns and conserves chip area. This configuration creates an efficient data super-highway for on-chip communication. Past designs have also reduced the number of bus lines by sharing one transmission line among multiple transceivers. This type of communication is largely referred to as a reconfigurable interface, since a transmitter may address a signal to a specific receiver (among multiple receivers) on the opposite end of the shared transmission line. Past designs have implemented this with code division multiple access (CDMA) [13–15]. Our design is similar in that three transceivers share the same transmission line; however, signals are not discriminated by code but are separated by band-pass filters. This allows a baseband signal and a single RF band to be transmitted simultaneously. The other RF band can be transmitted with the baseband at a different time, while the original RF band is not transmitting. In essence, the RF bands are time-sliced and have to share the line. Furthermore, our system implements a noncoherent modulation scheme that does not require power-hungry synthesizers, thus reducing the required chip space and power consumption.

¹We do not wish to give the impression that all three bands may be transmitted at the same time. Only the baseband signal and one of the RF bands may be transmitted simultaneously. This is explained in further detail in Chapter 4, where the transformer design is analyzed.
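
The band-sharing rule described above can be illustrated with a minimal sketch. The band names and the helper functions `is_valid_slot` and `schedule` are hypothetical and are not part of the design itself; only the rule (baseband plus at most one RF band per time slot) comes from the text.

```python
# Hypothetical band labels; only the sharing rule is taken from the text.
BASEBAND, RF1, RF2 = "baseband", "rf_10ghz", "rf_20ghz"
RF_BANDS = {RF1, RF2}
ALL_BANDS = {BASEBAND} | RF_BANDS


def is_valid_slot(active_bands):
    """Return True if these bands may share the line in one time slot.

    Baseband may always be present, but at most one RF band may
    transmit in any given slot.
    """
    active = set(active_bands)
    if not active <= ALL_BANDS:
        raise ValueError("unknown band in %r" % (active,))
    return len(active & RF_BANDS) <= 1


def schedule(rf_requests):
    """Give each requested RF band its own slot, in request order.

    Baseband is included in every slot because it never conflicts
    with a single RF band.
    """
    return [{BASEBAND, rf} for rf in rf_requests]


slots = schedule([RF1, RF2, RF1])
print(all(is_valid_slot(s) for s in slots))
print(is_valid_slot({RF1, RF2}))
```

A real arbiter would also have to decide which RF band gets the next slot; this sketch simply serves requests in order.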
1.1 Overview of this work

This document details my work on the design of a tri-band interconnect, which involves transceivers, band-pass filters and transmission lines. The ultimate goal is to provide a high-throughput, energy-efficient alternative to current designs that implement baseband-only signaling. Again, it should be emphasized that contemporary designs will be the limiting factor for future NoC implementations as 1) more buses are required and 2) higher bandwidth is demanded, 3) all while power consumption must be reduced.

1.1.1 Baseband and RF transmitter

A baseband transmitter that can operate at 4 Gb/s is designed for on-chip communication. The RF transmitter design, which involves a data buffer, an LC voltage-controlled oscillator (LC-VCO) and a modulator, is then analyzed. This chapter is very dense because of the many blocks that must interface with each other and the theory behind each individual block.

1.1.2 Transformer and Transmission Line

A band-selective transformer is characterized and designed for both bands of 20 GHz and 10 GHz. Additionally, a transmission line must be carefully designed such that RF frequencies may propagate with minimal loss.

1.1.3 Baseband and RF receiver

This chapter will focus primarily on the RF receiver design that uses a Gilbert multiplier structure for demodulation. We will find that demodulation is similar to envelope detection. Furthermore, we discuss the baseband receiver.

1.1.4 Conclusion

The final chapter will include post-extracted simulation results. Following the final results, I will conclude with critiques of the current design and suggestions for future directions of this research.
Chapter 2

RF Transmission Line Communication

In the most fundamental sense, communication is the action of conveying information from one point to another. In the context of human beings, this could simply be illustrated as a discussion between two people. Even in this example, however, a discussion may take a number of different forms: face to face, over a landline telephone, by cell phone or by email. One can see from just a handful of examples that there are various means of communication between two individuals. These different ways of communicating often involve both analog (in the biological and the electronic sense) and digital types. Both types form a symbiotic relationship and are present everywhere in our lives. This is not limited to these examples but also applies to this research, as will be shown later. However, before introducing our design choices, we should generalize the communication system and introduce how it may apply to this research. Following this, we will examine the RF interconnect and its different communication schemes. Finally, we will conclude with a generalized system architecture.

2.1 Communication System

As previously mentioned, communication may be described as transferring information from one point to another. A generalized block diagram can be used to illustrate a communication system, as shown in Figure 2.1. This diagram shows a point-to-point model with two main blocks: source and destination. The source block is the originator of the information/message, which is intended to be received by the destination or recipient. The communication system consists of three main components: transmitter, channel and receiver.

The transmitter is responsible for sending out the data in a pre-agreed format for the receiver. This formatting modifies the data in such a way that it overcomes non-idealities in the communication channel and can still be recovered. This modification procedure is referred to as modulation. Modulation involves altering a sinusoidal signal called the carrier wave. Recall from general physics that the properties of a sinusoidal wave include amplitude, frequency and phase, which are the foundations of many modulation schemes.

The next block is the communication channel. The channel is representative of many media, human-made or naturally occurring: free space, optical fiber or copper cabling. Because it is physical, the channel is analog. Unfortunately, we see from Figure 2.1 that the channel can become a source of undesired effects such as noise, interference and attenuation that can change our modulated signal. These channel effects are unavoidable and have to be dealt with through the proper design of the transmitters, the receivers and, if possible, the channel itself. One will see that the channel’s frequency response influences many design decisions in the system.

The final block is the receiver, whose purpose is to recover the original data from the modulated signal arriving from the channel. Recovery is mainly accomplished through demodulation. Demodulation is generally the reverse of the process that took place in the transmitter, in that it extracts information from the modulated carrier signal. However, this recovery process is not as straightforward as the definition implies. The receiver must be robust to the channel effects on the received signal in order to produce an accurate recreation of the original data. Note that it is impossible to recreate exactly the same signal as the one that was transmitted, and the receiver is just one of the many means of combating channel effects in a communication system.
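
As an illustration of the transmitter, channel and receiver chain, the following minimal sketch assumes on-off keying (the simplest form of ASK) and a noncoherent receiver that squares the signal, which is the idea behind a self-mixer. The 1 Gb/s bit rate, the flat channel gain of 0.5 and all function names are assumptions made for illustration only; the sketch ignores noise and ISI.

```python
import math


def ask_modulate(bits, carrier_hz, bit_rate, samples_per_bit):
    """Transmitter: carrier is on for a '1' and off for a '0'."""
    dt = 1.0 / (bit_rate * samples_per_bit)
    out = []
    for i, b in enumerate(bits):
        for k in range(samples_per_bit):
            t = (i * samples_per_bit + k) * dt
            out.append(b * math.cos(2 * math.pi * carrier_hz * t))
    return out


def channel(signal, gain=0.5):
    """Hypothetical channel: flat attenuation only (no noise, no ISI)."""
    return [gain * s for s in signal]


def self_mix_demodulate(signal, samples_per_bit):
    """Receiver: square the signal, average per bit, then threshold.

    Squaring needs no local oscillator, so the detection is noncoherent.
    The threshold is half of the largest per-bit energy observed.
    """
    energies = []
    for i in range(0, len(signal), samples_per_bit):
        chunk = signal[i:i + samples_per_bit]
        energies.append(sum(s * s for s in chunk) / len(chunk))
    threshold = max(energies) / 2
    return [1 if e > threshold else 0 for e in energies]


bits = [1, 0, 1, 1, 0, 0, 1]
tx = ask_modulate(bits, carrier_hz=10e9, bit_rate=1e9, samples_per_bit=100)
rx_bits = self_mix_demodulate(channel(tx), samples_per_bit=100)
print(rx_bits == bits)
```

Because the detector only looks at signal energy, the attenuation of the channel scales the threshold but does not change the decisions in this idealized case.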

2.2 Baseband Communication

Baseband (BB) communication can be described as low-frequency data transmission where the BB bandwidth is generally dictated by the cut-off frequency of the transmission line or channel. An illustration of this concept is shown in Figure 2.2. Our assumption in this example is that a rectangular pulse represents a logical '1', the absence of the pulse represents a logical '0', and the data rate is within the channel bandwidth. This means that the data should not experience any attenuation from the channel or intersymbol interference (ISI), a type of distortion. In Figure 2.3, on the other hand, the data bandwidth is larger than the channel’s capacity. When this signal arrives at the end of the transmission line, we can expect that it will be attenuated and will suffer from ISI. Of the two, ISI has the more pronounced effect on the transmitted signal, so further explanation is required.

ISI is a major concern of transceiver designers today. It is a frequency-dependent type of distortion caused by the residual effects of longer rectangular pulses interfering with the shorter (higher-frequency) rectangular pulses that follow them. The signal distortion is often described as a smearing or blurring of signals. The longer pulses, which have a larger DC component, have longer RC time-constant tails associated with charging and discharging the channel. These RC tails bleed into the next bit and, by superposition, add to the subsequent bit to yield a smeared bit. One can see these transient effects of ISI in Figure 2.4 (a). Other than the smearing, we also see that the high-frequency pulse that follows the two longer rectangular pulses is the most degraded signal of all. This is still a relevant problem today, and many designers take the approach of "pre-emphasizing" or "equalizing" the signal such that the frequency-dependent attenuation and smearing is nearly constant for all signals (pictured in Figure 2.4 (b)). Pre-emphasis occurs on the transmitter end and amplifies the high-frequency components of the signal. Equalization is done on the receiver end and attenuates the larger DC components of the signal. These remedies always require extra circuitry with feedback and sometimes frequency synthesizers (e.g. PLL or DLL). The ultimate result is more chip space and more power for higher data bandwidth. In many instances, the power and chip-space cost of equalizer/pre-emphasis circuits, together with their much more complicated (closed-loop) design, can outweigh the benefit.
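
The smearing mechanism can be reproduced with a minimal sketch that models the channel as a single first-order RC low-pass. This is an assumption made for illustration (a real on-chip line is distributed), and the time constant of two bit periods is a hypothetical value chosen to make the effect visible.

```python
import math


def rc_channel(bits, samples_per_bit=20, tau_bits=2.0):
    """First-order RC response to an NRZ bit stream.

    tau_bits is the RC time constant expressed in bit periods
    (hypothetical value). The output starts from a discharged line.
    """
    alpha = 1.0 - math.exp(-1.0 / (samples_per_bit * tau_bits))
    v, out = 0.0, []
    for b in bits:
        for _ in range(samples_per_bit):
            v += alpha * (b - v)  # exact RC step for a constant input
            out.append(v)
    return out


def sample_at_bit_end(wave, samples_per_bit=20):
    """Sample the line voltage at the end of every bit period."""
    return [wave[i + samples_per_bit - 1]
            for i in range(0, len(wave), samples_per_bit)]


# A single '0' after a long run of '1's versus a '0' after one '1'.
long_run = sample_at_bit_end(rc_channel([1] * 6 + [0]))
short_run = sample_at_bit_end(rc_channel([1, 0]))
print(round(long_run[-1], 3), round(short_run[-1], 3))
```

With these assumed values, the '0' that follows six consecutive '1's is still above the mid-rail level of 0.5 at its sampling instant and would be read as a '1', whereas the same '0' after a single '1' is well below mid-rail. The residual charge from the long pulse is what bleeds into the following bit.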

This raises the question of why one would not simply add another channel instead of pushing the BB bandwidth beyond the channel’s capacity. In most applications, such as CPU-to-RAM interfaces, the space for extra transmission lines is simply not available. In fact, the data lines used in these applications are never differential, even though differential lines have the greatly sought-after benefits of low susceptibility to noise and larger output swing. So, how have designers been able to cope with the ever-increasing bandwidth demand of media-intensive applications using a small number of bandwidth-limited channels/pins? The answer is efficient multi-level signaling, commonly referred to as pulse-amplitude modulation (PAM).

PAM comes in varying orders, distinguished by the number of logical levels. For example, PAM-4 contains logic levels 0, 1, 2 and 3 between the bottom and top supply rails. PAM-4 effectively doubles the data rate, as one is able to encode two bits in one symbol, as shown in Figure 2.5. The left side shows the various voltage levels and the right side displays the corresponding symbol that is encoded at that particular level. The take-away from this discussion is that BB signaling has reached a bandwidth bottleneck in most applications while bandwidth demand continues to rise. Currently, most high-bandwidth applications push the BB signaling beyond the channel capacity, as with M-ary PAM, and most of the design focus goes towards pre-emphasis and equalizer pulse-shaping circuits. This is not the only implication. It has been shown that the largest portions of transceiver power budgets (i.e. greater than 200 mW) go towards these equalizing circuits [2].
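
A minimal sketch of the two-bits-per-symbol idea follows. It assumes a natural binary mapping of bit pairs to levels 0 to 3; the actual mapping used in [2] is not stated here, and practical links often prefer Gray coding. The function names are hypothetical.

```python
def pam4_encode(bits):
    """Map each pair of bits to one PAM-4 level in {0, 1, 2, 3}.

    Natural binary mapping (an assumption): the first bit of the pair
    is the most significant.
    """
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [2 * bits[i] + bits[i + 1] for i in range(0, len(bits), 2)]


def pam4_decode(symbols):
    """Inverse mapping: one level back to two bits."""
    bits = []
    for s in symbols:
        bits.extend([s >> 1, s & 1])
    return bits


data = [0, 0, 0, 1, 1, 0, 1, 1]
symbols = pam4_encode(data)
print(symbols, len(symbols) * 2 == len(data))
```

Eight bits become four symbols, so for a fixed symbol rate the bit rate doubles. The price is that the same voltage range must now hold four levels instead of two.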
2.3 RF Interconnect Communication

With shrinking processes that use lower supply voltages, we have reached a limit on how much data one can fit in a band-limited channel using BB signaling like PAM-4. As a specific example, in [2] the top supply rail voltage is only 1.1 V, so the ideal eye opening for each of the three PAM-4 eyes is at most 367 mV. When measured, though, the eye opening came out to be only 95 mV differentially, whereas roughly 100 mV is likely the recovery threshold for a signal. The RF interconnect (RFI) is motivated by the issues that arise from forcing the BB bandwidth beyond the channel bandwidth. With an RFI system, we could have two data streams: a BB stream and a data-modulated RF stream. In this case, one would not have to worry so much about the dominating presence of ISI if the BB bandwidth is kept within the channel bandwidth. A general spectrum diagram of RFI communication is shown in Figure 2.6, where a BB and an RF stream are superimposed on the same data line. The main limiting factor in the high-frequency region is not ISI but the attenuation of the channel at the carrier frequency of the RF signal. The sources of this attenuation are the skin effect and dielectric absorption. In essence, one is limited only by the output swing our transmitters need to overcome this attenuation in the RF regime.
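
The 367 mV figure follows from dividing the supply range evenly among the eyes of the multi-level signal. The helper below is a hypothetical illustration of that arithmetic and assumes equally spaced levels that span the full rail.

```python
def ideal_eye_height(v_supply, levels):
    """Ideal vertical eye opening for equally spaced levels.

    A signal with N levels has N - 1 eyes stacked between the rails,
    so each eye can be at most v_supply / (N - 1) tall.
    """
    if levels < 2:
        raise ValueError("need at least two levels")
    return v_supply / (levels - 1)


pam4_eye_mv = ideal_eye_height(1.1, 4) * 1000
print(round(pam4_eye_mv))
```

For a 1.1 V rail this gives about 367 mV per eye for PAM-4, compared with the full 1.1 V for two-level signaling, which shows how quickly the voltage margin shrinks as levels are added.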
2.3.1 Past RFI Works

The idea was first proposed by Chang et al. [3], who demonstrated a data-modulated RF signal on a 20 mm on-chip transmission line. The intention of this work was mainly to produce a low-latency alternative to on-chip optical communication. The idea was centered on the telegrapher’s equation, which exhibits two different regimes of phase velocity (i.e. signal propagation):

\[
\frac{\partial^2 V}{\partial x^2} = RC \frac{\partial V}{\partial t} + LC \frac{\partial^2 V}{\partial t^2} \tag{2.1}
\]

where \( x \) is the distance along the wire, \( t \) is time, \( V \) is voltage, and \( R \), \( L \) and \( C \) are the resistance, inductance and capacitance of the line per unit length. The low-frequency regime exists when \( R \gg \omega L \), so that the first term on the right-hand side of Equation 2.1 dominates. Consequently, the phase velocity becomes the following:

\[
v = \sqrt{\frac{2\omega}{RC}} \tag{2.2}
\]

This equation matches how we normally envision the transmission line: as a lumped RC model that has to be charged and discharged to discern a '1' or '0'. Increasing the frequency far enough moves the line into the high-frequency regime, where \( R \ll \omega L \) and the \( LC \) term of equation 2.1 dominates. In this case, the phase velocity is modeled by the following:

\[
v = \frac{1}{\sqrt{LC}} \tag{2.3}
\]

In this regime, the transmission line behaves more like a waveguide for the data stream. The greater the inductance of the line, the smaller the output swing required for the signal to travel the length of the line. Additionally, RF signals transmitted in this regime are no longer affected by ISI. However, the designer must be aware of the channel attenuation at the particular carrier frequency, which is the main limitation of this design.
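
To make the two regimes concrete, the Python sketch below evaluates the phase velocity directly from the propagation constant implied by equation 2.1, \( \gamma = \sqrt{(R + j\omega L)(j\omega C)} \), rather than from either limiting formula. The per-unit-length values are assumptions chosen only for illustration; they are not taken from [3] or from this design.

```python
import cmath
import math

# Assumed per-unit-length line parameters (illustrative only).
R_PER_M = 2.0e4    # ohm/m
L_PER_M = 4.0e-7   # H/m
C_PER_M = 1.0e-10  # F/m

def phase_velocity(freq_hz: float) -> float:
    """Phase velocity omega/beta, where beta is the imaginary part of
    gamma = sqrt((R + jwL)(jwC))."""
    w = 2.0 * math.pi * freq_hz
    gamma = cmath.sqrt(complex(R_PER_M, w * L_PER_M) * complex(0.0, w * C_PER_M))
    return w / gamma.imag

v_lc = 1.0 / math.sqrt(L_PER_M * C_PER_M)  # high-frequency limit of equation 2.3

# R >> wL at both of these points, so the line is in the RC regime.
rc_ratio = phase_velocity(40e6) / phase_velocity(10e6)
# R << wL here, so the line is in the LC regime.
lc_ratio = phase_velocity(100e9) / v_lc

print(f"Quadrupling the frequency in the RC regime scales v by {rc_ratio:.2f}")
print(f"At 100 GHz, v is {lc_ratio:.3f} of 1/sqrt(LC)")
```

In the RC regime the velocity depends on frequency, so the spectral components of a broadband BB pulse arrive at different times, whereas in the LC regime the velocity is essentially constant across the band occupied by the RF stream.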

Figure 2.7 displays the general block diagram of Chang et al.'s system. Although the modulation scheme was not disclosed in the paper, the system is synchronized and is likely using binary phase-shift keying (BPSK). The input data stream is up-converted at the ring mixer and transmitted down the 20 mm line. The received signal is then down-converted by a double-balanced mixer and pulled to full rail by a sense amplifier. The measured results show a data rate of 2 Gb/s with a delay of 20 ps across the 20 mm transmission line. The total power consumed by the system was 16 mW. Recall that the basis of the work was to create an alternative to optical signaling for low-latency data transmission. The total delay (propagation time from input to output) matched or bested that of all the optical links it was compared against. The optical links' lower performance is due to the optical-to-electrical conversions needed to interface the transceivers to the transmission line. Finally, the RF link yielded the smallest power consumption of all compared works and promised more practical chip integration than optical links.

Understanding that this work was a prototype, we acknowledge that improvements could be made. First, this work only utilized a single band on an on-chip transmission line with a relatively low data rate. Second, the architecture was a synchronous communication design, meaning both the transmitter and the receiver need a clock source. Synchronization requires extra circuits such as a phase-locked loop or delay-locked loop, which results in more power and chip area consumption.

Following this work, Ko et al. [4] made a few design improvements to include a dual-band (RF + BB) link that communicated across a 10 cm off-chip FR4 PCB transmission line. They measured a total power consumption of 92 mW with an aggregate data rate of 3.6 Gb/s. The measured eye diagrams along with the pseudo-random transmitted and recovered data for both bands can be viewed in Figure 2.8. The nearly six-fold power increase for less than double the data rate of [3] can be attributed to the following: the off-chip channel, the five times longer transmission line, and the synchronous communication scheme. While this work does exhibit the utility of multiple data streams and their bandwidth flexibility, the system still uses a coherent modulation scheme, which again requires power-hungry frequency synthesizers. Furthermore, the total data throughput leaves room for improvement.

In 2009, Tam et al. [5] introduced the tri-band interconnect, which is composed of two RF bands and one BB band and achieves an aggregate data rate of 10 Gb/s\footnote{At this point, it is appropriate to acknowledge that this work is not fundamentally different from that of Tam et al. My original intention was to exceed the 10 Gb/s aggregate data rate with simultaneous transmission; unfortunately, as Chapter 6 will show, not everything came out as planned. Even so, this work turned out to be a fruitful learning experience.}. Not only did this publication show that more than one RF band can be used to increase line efficiency, but it did so with a non-coherent communication scheme. The benefit of this design choice is evident in the energy per bit: 1.7 mW/Gb/s. This value is outstanding compared to BB-only signaling at 12 mW/Gb/s and 9.4 mW/Gb/s [2,16].
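
The mW/Gb/s metric is simply total power divided by aggregate data rate, and it is numerically equal to pJ/b. As a rough comparison, the Python sketch below applies it to the power and data-rate figures quoted above for [3] and [4]; those two results are derived here from the quoted numbers and are not reported in these terms by the cited works.

```python
def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ/b; 1 mW / (1 Gb/s) = 1 pJ/b."""
    return power_mw / rate_gbps

# (power in mW, aggregate data rate in Gb/s) as quoted in this chapter
links = {
    "Chang et al. [3], single RF band": (16.0, 2.0),
    "Ko et al. [4], RF + BB": (92.0, 3.6),
}

for name, (power_mw, rate_gbps) in links.items():
    print(f"{name}: {energy_per_bit_pj(power_mw, rate_gbps):.1f} pJ/b")
```

Both coherent links land well above the 1.7 mW/Gb/s quoted for the non-coherent tri-band interconnect of [5].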

Figure 2.8: On the left are the two input data streams with their recovered data streams. The right diagram displays the measured eye diagrams of both data streams [4].

Referring to Figure 2.9, one can gain some insight into how this was accomplished. First notice that there are three transmitters: one baseband and two RF transmitters, with the two carrier frequencies being 30 GHz and 50 GHz. These are all directly interfaced to two transformers, which couple the RF signals to the shared differential transmission line. On the receiver side, the shared differential line is again connected to two transformers, which decouple each RF signal to its respective RF receiver. The RF receivers utilize mixers to down-convert the RF-modulated signals back to their original baseband data. The transmitted baseband data passes directly through both sets of transformers by always staying on the secondary coil side. The main contribution here was that amplitude-shift keying (ASK) was used non-coherently as the RF modulation scheme. One can verify that the link is not synchronous, as there is no clock line connecting the transceivers like there is in Figure 2.7. Non-coherent ASK demodulation is realized through envelope detection and does not require any power-hungry frequency synthesizers or extra clocking circuits.

For all the advancements made in the context of the RF interconnect, there are a few reservations about this work. First, Figure 2.10 shows the eye diagram of each band. The baseband eye diagram (in yellow) is quite clean, but the two RF eye diagrams are somewhat dubious. The 50 GHz eye is measured to be only 80 mV open in a lab setting, which suggests it would be hard to implement in less controlled (i.e. noisy) environments. Second, the bit-error rate (BER) was measured at $1 \times 10^{-9}$, which is $10^3$ times higher than that of the works it was compared against. Also worthy of note is that an on-chip transmission line was used, which is typically viewed as an easier channel to drive with less attenuation than an off-chip line, so the BER should likely have been lower.

What distinguishes the work in [5] from the interconnect proposed in this research is visible in the details. In other words, the general topology is the same, but the implementation of each circuit block is slightly different. It is difficult to directly compare the circuit block implementations due to the lack of circuit information in [5], but it is known that Tam et al. used a 90 nm process with two RF bands at 30 GHz and 50 GHz. This thesis work utilizes a 130 nm process with RF bands at 10 GHz and 20 GHz. One circuit block that is likely implemented differently is the mixer, which is the RF front-end. The mixer in this work is passive and does not require a current mirror.

Figure 2.10: The recovered data and eye diagrams of the tri-band interconnect. Eye diagram measurements of each signal from left to right: 50GHz, 30GHz, baseband [5].

In the years following, [9], [10] and [11] matured the dual-band interconnect (DBI), which is composed of one RF-band data stream and one baseband data stream. All three publications demonstrated energy-efficient ($\leq 4$ pJ/b), high-bandwidth transceivers capable of 8 Gb/s throughput. What made these publications more practical for contemporary use than [5] was that these transceivers communicated through an off-chip PCB line, which is more difficult to drive than an on-chip line. This has many implications because most commercial chips communicate chip-to-chip through an off-chip transmission line. Kim et al. [11] demonstrated DBI communication through a single transmission line (i.e. not differential), which is quite promising for serial and backplane applications since pin count is an especially concerning issue for chip designers. In terms of Gb/s/pin, [9,10] amount to 4.2 Gb/s/pin since differential transmission lines were used. Driving the signal across one line requires slightly more power (4 pJ/b vs. 2.5 pJ/b) and is more susceptible to noise ($1 \times 10^{-12}$ BER vs. $1 \times 10^{-15}$ BER) than driving a differential transmission line. The die photo and PCB test board are shown in Figure 2.11. Notice the circular transformers in the die photo that are close to the pads and have the RF transceiver label in their circle.

2.4 ASK Modulation

Figure 2.11: The recovered data and eye diagrams of the tri-band interconnect. Eye diagram measurements of each signal from left to right: 50GHz, 30GHz, baseband.

There are numerous digital modulation schemes to choose from for RF communication, but two have been especially successful in RF interconnects: binary phase-shift keying (BPSK) and amplitude-shift keying (ASK). Both schemes represent digital data by modifying one parameter of a sinusoidal signal, which for BPSK is phase and for ASK is amplitude. The symbol definitions are expressed below. Note that the transmitted modulated signals are analog because the channel is inherently analog. In other words, these signals represent logical values, but the waveforms themselves are analog.

\[
\text{ASK Modulation: } s(t) = \begin{cases} 
A \cos(2\pi f_c t) & \text{binary 1} \\
0 & \text{binary 0}
\end{cases} \tag{2.4}
\]

\[
\text{BPSK Modulation: } s(t) = \begin{cases} 
A \cos(2\pi f_c t) & \text{binary 1} \\
-A \cos(2\pi f_c t) & \text{binary 0}
\end{cases} \tag{2.5}
\]

BPSK has been shown in [3,4] to successfully transmit through an RF interconnect. One can see from equation 2.5 that in BPSK both symbols '0' and '1' are represented by full amplitude levels, while in ASK (equation 2.4) a '1' is a full amplitude swing and a '0' has no swing because the signal is switched off. This intuitively shows that there is a higher power cost per symbol for BPSK than for ASK. The aforementioned works have only accomplished coherent BPSK demodulation. This type of demodulation has traditionally utilized the Costas loop, which requires an additional power-hungry voltage-controlled local oscillator and two mixers. Non-coherent ASK demodulation has been previously demonstrated [5,9–11] not only to use less power than BPSK, but also to require less chip area and less design complexity. It is also worth stating that the modulators of both BPSK and ASK consist of simple transistor configurations, which for ASK will be shown in detail in the next chapter. From these considerations, ASK was chosen for power efficiency and design simplicity.
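
The power argument and the appeal of envelope detection can be checked numerically. The Python sketch below samples equations 2.4 and 2.5 for a short bit pattern, compares the mean signal power, and recovers the ASK data with a rectify-and-average detector that never uses the carrier phase. The sampling parameters, threshold and function names are assumptions made for illustration; this is an idealized model, not the receiver circuit of this work.

```python
import math

def modulate(bits, scheme, amplitude=1.0, cycles_per_bit=5, samples_per_cycle=16):
    """Sampled s(t) following equation 2.4 ("ask") or 2.5 ("bpsk")."""
    samples = []
    for bit in bits:
        if scheme == "ask":
            scale = amplitude if bit else 0.0
        elif scheme == "bpsk":
            scale = amplitude if bit else -amplitude
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        for n in range(cycles_per_bit * samples_per_cycle):
            samples.append(scale * math.cos(2.0 * math.pi * n / samples_per_cycle))
    return samples

def mean_power(samples):
    return sum(x * x for x in samples) / len(samples)

def envelope_detect(samples, samples_per_bit, threshold):
    """Non-coherent detection: rectify, average over each bit, compare."""
    bits = []
    for start in range(0, len(samples), samples_per_bit):
        chunk = samples[start:start + samples_per_bit]
        level = sum(abs(x) for x in chunk) / len(chunk)
        bits.append(1 if level > threshold else 0)
    return bits

data = [1, 0, 1, 1, 0, 0, 1, 0]  # equal numbers of ones and zeros
ask = modulate(data, "ask")
bpsk = modulate(data, "bpsk")

power_ratio = mean_power(ask) / mean_power(bpsk)
recovered = envelope_detect(ask, samples_per_bit=5 * 16, threshold=0.3)
bpsk_envelope = envelope_detect(bpsk, samples_per_bit=5 * 16, threshold=0.3)

print(f"ASK / BPSK mean power: {power_ratio:.2f}")
print("ASK recovered correctly:", recovered == data)
print("BPSK through the same detector:", bpsk_envelope)
```

For equiprobable data, ASK transmits half the mean power of BPSK at the same amplitude, and the same envelope detector applied to BPSK returns all ones because the information is carried in the phase, which is why BPSK needs a coherent receiver.
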
Chapter 3

Baseband and RF transmitter

3.1 Introduction

Transmitters are responsible for sending out the signal waveforms that propagate through the channel to the receiver. In this system, there are three transmitters, one for each of the three bands. The overall architecture can be seen in Figure 3.1. The first band is the BB signal, while the other two bands are RF1 and RF2, which operate at 10 GHz and 20 GHz, respectively. In this chapter, I will elaborate on the development and design of these three transmitters for a reconfigurable RFI system. All circuit designs were completed in an IBM 130 nm process.

In the first section, I will introduce the baseband transmitter through its architecture, circuit design, simulations and layout. In the sections that follow, I will show the generic architecture of an RF transmitter and detail the designs of the two different bands. Since these transmitter designs require more background knowledge and allow more variability in the design choices, I will consider the different options for this specific application. I will then give the specifications of the final design choice through circuit design, simulations, and layout.
3.2 Baseband Transmitter

The overall architecture of the BB transmitter consists of an input buffer, an inverter chain and an output driver. These components are depicted in Figure 3.2. The architecture of the BB transmitter is rather simple, but it has worked well in past works [9–11]. It accepts a differential input, but its output is single-ended. Because the transmission line is on-chip and only 4 mm long, there is no need for differential signaling to provide noise immunity or to compensate for a small output swing. The main specification for this block is operation at up to 5 GHz.

The first component block is the input buffer. The input buffer comprises an NFET differential amplifier with a PFET active load, which converts the differential input to a single-ended output. The schematic of this circuit is shown in Figure 3.5 along with its open-loop frequency response. The frequency response shows that it can accept a differential BB signal at 5 GHz, where the gain is a little less than 5 dB. This output signal is fed into an inverter chain (pictured in Figure 3.2) that drives the output driver. A 50 Ω transmission line termination impedance was chosen, so the effective impedance of the resistor at the output driver is also 50 Ω. Figure 3.6 gives the schematic view of the output driver. One should notice that the output driver's schematic is similar to an inverter. When the input data goes high, the output discharges through the NFET and its 50 Ω resistor. When the input data goes low, the output charges through the PFET and its 50 Ω resistor. This allows for the maximum worst-case charge/discharge to be from 0 V/1.2 V to $\frac{1}{2}V_{dd}$.

The post-extracted simulation of the complementary input data and the BB transmitter output at 5 Gb/s may be viewed in Figure 3.7. The output swing of the BB output driver (400 mV) is more than enough to pass through the channel for full recovery at the receiver end. The layout of the BB transmitter is shown below the transient simulation in Figure 3.8. One item of interest is the layout of the resistors, which may be difficult to see in Figure 3.8 but is of concern to an IC designer. Resistors are well known to deviate by up to ±20 percent from their designed value after fabrication. The best way to combat this expected error is to lay out multiple resistors in parallel such that their equivalent resistance is the designed value. This method lets the resistor occupy a relatively larger chip area, which yields less error than a smaller form factor. Furthermore, dummy resistors should be placed on the open sides of the parallel resistors.
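
The benefit of splitting a resistor into parallel segments can be illustrated with a small Monte Carlo experiment. The Python sketch below is a simplified model under stated assumptions: each segment carries an independent Gaussian error with a hypothetical 5 percent standard deviation, and the function names are illustrative. Only this random, segment-to-segment component averages out; a global shift in sheet resistance moves all segments together and is not reduced by this technique.

```python
import random
import statistics

def parallel_resistance(values):
    return 1.0 / sum(1.0 / r for r in values)

def spread_percent(n_parallel, target_ohm=50.0, sigma_frac=0.05, trials=2000, seed=1):
    """Standard deviation of the equivalent resistance, as a percentage of
    target_ohm, when n_parallel segments (each nominally n_parallel * target_ohm)
    carry independent Gaussian errors of relative size sigma_frac."""
    rng = random.Random(seed)
    nominal = n_parallel * target_ohm
    results = []
    for _ in range(trials):
        segments = [nominal * (1.0 + rng.gauss(0.0, sigma_frac))
                    for _ in range(n_parallel)]
        results.append(parallel_resistance(segments))
    return 100.0 * statistics.pstdev(results) / target_ohm

single = spread_percent(1)     # one 50-ohm resistor
sixteen = spread_percent(16)   # sixteen 800-ohm segments in parallel
print(f"1 segment: {single:.2f}% spread, 16 segments: {sixteen:.2f}% spread")
```

Under the independence assumption the spread falls roughly with the square root of the number of segments, which matches the intuition that a larger total resistor area gives a more accurate value.
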
Figure 3.6: Output driver schematic.

Figure 3.7: Transient simulation with the positive complement input signal on top and the output signal to the channel on the bottom.
Figure 3.8: The layout of the baseband transmitter.

Figure 3.9: The general RF transmitter architecture consists of a VCO, input buffer and an ASK modulator.

3.3 RF Transmitter

The RF transmitter architecture consists of three blocks: a free-running voltage-controlled oscillator (VCO), an input buffer and an ASK modulator. This setup is shown in Figure 3.9. The VCO's output is the carrier signal that will be modulated by the input buffer's output data. The ASK modulator accepts input from the VCO and the input buffer. In a sense, it allows the carrier to feed through when the input buffer signal is high. This is the final output that is sent to the transmission line. This type of modulation is sometimes referred to as on-off keying.

For this design, we have to transmit at two different frequency bands, but the two transmitters differ only slightly in terms of W/L ratio optimization, so each does not justify its own section. Fundamentally, both RF transmitters are the same, and the specifications that differentiate them from one another are their inductors and bias currents. The frequency bands used in this research are 10 GHz and 20 GHz. I will first go over the fundamentals of the two most common types of VCO: ring and LC. Both VCO types are relevant to this work and thus warrant their own study. Lastly, I will finalize the VCO design choice and show the complete transmitters' post-extracted simulation along with their layouts.

3.3.1 VCO

There are two common types of VCOs that designers prefer to use as their clock generators: the ring-VCO and the LC-VCO. In both cases, the VCO generates a sinusoidal signal from a DC bias input, but the signal generation is accomplished differently for ring and LC-type VCOs. The name, voltage-controlled oscillator, comes from the fact that the input DC bias controls the output frequency of the VCO. The ring-VCO generates its signal through a positive feedback system, while the LC-VCO utilizes an LC tank. For practical purposes, the VCOs of this work are biased at the ground rail to keep the design straightforward. It should also be noted that whole dissertations have been devoted to just one specific VCO, and this section is by no means intended to be an exhaustive study of VCOs. Instead, its objective is to go over the textbook fundamentals of each with enough brevity that a designer has the basic tools to quickly design a functioning VCO that interfaces with the other RFTX blocks.

3.3.2 Ring-VCO

The ring-VCO is a feedback loop formed by cascading an odd number of inverting stages. In order for oscillation to occur, the feedback system must exhibit positive feedback such that the output signal continually grows over time. A generic negative feedback system is shown in Figure 3.10 and its closed-loop transfer function is

$$\frac{Y}{X}(s) = \frac{H(s)}{1 + \beta H(s)} \quad (3.1)$$

However, it is possible at high frequencies (usually beyond the 3 dB roll-off) for the phase shift of $H(j\omega)$ to become large enough that its sign is inverted, effectively changing the subtraction at the input to an addition, which yields a positive feedback system. When the loop gain $\beta H(j\omega)$ equals $-1$ (unity magnitude with a phase of $-180^\circ$), the denominator vanishes and the transfer function approaches infinity, which is the unstable case. In the unstable case, the circuit amplifies its own noise component at $\omega_0$ and sustains an oscillation at that frequency. The conditions to initiate oscillation are well known and are called the Barkhausen criteria [6]:

$$|\beta H(j\omega)| \geq 1 \quad (3.2a)$$

$$\angle H(j\omega) = 180^\circ. \quad (3.2b)$$

These criteria need to be met for all VCO types. Equation 3.2a may seem obvious for the system to oscillate, but 3.2b is needed so that the "phase shift ensures the feedback signal enhances the original signal", where the word "enhances" describes the additive process at the input due to the sign change [6]. Equation 3.2b can be understood by thinking about a Bode plot of magnitude and phase for a closed-loop system. This second criterion is a frequency-dependent phase shift, satisfied when the phase has reached $-180^\circ$ (i.e. signal inversion) before the magnitude has fallen to the unity-gain crossover point. This is where the subtractive process in a closed-loop system becomes an additive one. Once this second criterion has been satisfied while the gain is greater than or equal to 1 (i.e. at or above unity gain), the system oscillates.
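
As a worked instance of these criteria, consider a three-stage ring in which every stage is idealized as a single-pole amplifier $A_0 / (1 + j\omega/\omega_0)$, with the DC inversion of the three stages accounted for by the subtraction at the input. This single-pole model is a common textbook idealization and an assumption here; it does not describe the actual delay cells of this work. The Python sketch below finds the frequency at which the stages contribute $180^\circ$ of lag and the per-stage gain needed for unity loop gain at that frequency.

```python
import math

def ring_loop_gain(a0, w_over_w0, stages=3):
    """Magnitude and frequency-dependent phase (degrees) of `stages`
    identical single-pole stages, each A0 / (1 + j*w/w0)."""
    magnitude = (a0 / math.sqrt(1.0 + w_over_w0 ** 2)) ** stages
    phase_deg = -stages * math.degrees(math.atan(w_over_w0))
    return magnitude, phase_deg

stages = 3
# Each stage must contribute 180/stages degrees of lag (criterion 3.2b).
w_osc = math.tan(math.pi / stages)      # oscillation frequency in units of w0
# Per-stage gain that makes the loop magnitude exactly 1 (criterion 3.2a).
a0_min = math.sqrt(1.0 + w_osc ** 2)

magnitude, phase_deg = ring_loop_gain(a0_min, w_osc, stages)
print(f"w_osc = {w_osc:.3f} * w0, minimum A0 = {a0_min:.3f}")
print(f"loop magnitude = {magnitude:.3f}, phase = {phase_deg:.1f} degrees")
```

For three stages this gives an oscillation frequency of $\sqrt{3}\,\omega_0$ and a minimum low-frequency gain of 2 per stage; any smaller gain fails 3.2a even though 3.2b is met.
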
There have been a number of different ring-VCO delay elements researched throughout the years, but each usually includes no more than eight transistors. So, when picturing the size of a ring-VCO as a whole with all of its elements combined, the oscillator is not large relative to other component blocks in a system. The advantage that sets a ring-VCO apart from an LC-VCO is that no passive components (i.e. inductors and capacitors) are needed, which is why it is substantially smaller than an LC-VCO.

In the context of this research, a ring-VCO is ideal because it is smaller and consumes less power than an LC-VCO [17]. One of the main disadvantages of a ring-VCO is that it is susceptible to phase noise. This issue is definitely of concern when designing synchronized circuits; however, our transceivers are non-coherent, so this characteristic does not affect our design. The other main disadvantage that could affect our design is that the tuning range of a ring-VCO is much wider than that of an LC-VCO [18]. This means that noise on the voltage supplies or bias could cause the carrier frequency to deviate from its designed value more than it would in an LC-VCO. So, one must ensure that the design combats this with large capacitors connected to the supply rails.

It should be noted that the ring-VCO can only be considered for the 10 GHz transmitter because its frequency is limited to about 10 GHz, so it cannot provide a 20 GHz signal. This frequency limitation is typical for a 130 nm CMOS technology.

As a designer, I was aware of the advantages and disadvantages of the ring-VCO, and I accounted for the noise issues in my design so that they would not pose a problem. One issue I did not anticipate was the parasitic capacitance inherent in the layout, which ultimately prevented me from incorporating a ring-VCO in my 10 GHz transmitter design. In schematic transient simulation, the ring-VCO output a carrier frequency of 10.5 GHz. After layout, a post-extracted transient simulation showed that this carrier frequency had dropped to 8.25 GHz. Even after revising the layout to reduce these capacitances, the carrier frequency increased by only a negligible amount. This lowered frequency is not capable of modulation at a data rate of 4 Gb/s. Therefore, this design could not be used, but it demonstrated a valuable lesson in the integrated circuit design process: when designing high-frequency circuits, one should always attach estimated line capacitances in the schematic to best model the layout performance. This was a good learning experience not only in designing the ring-VCO, but more so as a general IC design lesson.
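
The size of the parasitic loading can be estimated from the two simulated frequencies. The Python sketch below assumes a first-order model in which the stage delay, and therefore the oscillation period, scales linearly with the effective node capacitance; this proportionality and the function names are assumptions for illustration rather than extracted data.

```python
def implied_cap_increase(f_schematic_ghz, f_extracted_ghz):
    """Fractional increase in effective node capacitance, assuming the
    oscillation period is proportional to that capacitance."""
    return f_schematic_ghz / f_extracted_ghz - 1.0

def derated_frequency(f_schematic_ghz, extra_cap_fraction):
    """Frequency expected after adding extra_cap_fraction of capacitance."""
    return f_schematic_ghz / (1.0 + extra_cap_fraction)

increase = implied_cap_increase(10.5, 8.25)
print(f"Implied extra capacitance: {100.0 * increase:.0f}%")

# Schematic frequency needed to still reach 10 GHz after the same loading
required_schematic = 10.0 * (1.0 + increase)
print(f"Schematic target for 10 GHz post-layout: {required_schematic:.2f} GHz")
```

Under this model the layout added roughly a quarter more capacitance than the schematic accounted for, which illustrates why wire capacitance estimates belong in the schematic from the start.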

3.3.3 LC-VCO

The main alternative to the ring-VCO is the LC-VCO. The fundamentals of LC-VCOs can be understood by revisiting basic circuit theory (see Figure 3.11). Recall that an ideal LC tank behaves like an oscillator whereby the inductor and capacitor trade energy continuously to create a sinusoidal output voltage at the natural frequency $\omega = \frac{1}{\sqrt{LC}}$. In a lab setting, however, the oscillations are damped since these passive components are not ideal. The damping can be modeled by a resistor as seen in Figure 3.11, which represents the resistances present in the inductor, capacitor and wires. So, the question is what kind of FET configuration allows an LC-VCO to oscillate without being damped?

If we add the non-ideal LC tank at the output of a cross-coupled NFET pair, we get something like Figure 3.12. Cross-coupling describes a differential transistor configuration in which the output drain of each transistor is connected to the input gate of its half-circuit counterpart, as illustrated in Figure 3.12. Using the small-signal half-circuit model of the cross-coupled NFET pair, the output resistance can be derived to be $-\frac{1}{g_{m}}$, which can be thought of as a negative resistor. This negative resistance can only be created by active devices in a positive feedback configuration (here, the cross-coupled configuration), and it effectively cancels the damping that comes from the resistance of the inductor and capacitor.

Figure 3.12: The LC-VCO schematic that models the resistances present in the inductor and capacitor.

This finding assumes that both transistors are exactly the same, which has to be true in order to achieve equal output resistance. Additionally, we satisfy the Barkhausen condition in 3.2b because the half-circuit common-source stage contributes a 180° phase shift, which adds to the tank's phase shift of 180° [6]. In conclusion, we may now simply represent the LC-VCO without its parasitic resistances as illustrated in Fig. 3.13.

An alternative way to see that the cross-coupled oscillator does indeed oscillate is to examine the open-loop voltage gain and phase of the circuit in Fig. 3.14. This technique and example were originally presented in [6]. The outputs $V_1$ and $V_2$ are differential, but analyzing this circuit we see that it consists of two cascaded common-source amplifiers. Hence, the total loop gain becomes $g_{m1}R_p g_{m2}R_p$. The output resistance $r_o$ is much greater than the parasitic resistance $R_p$, with which it is in parallel in the small-signal model. Therefore, we are not concerned about its effect on the overall gain because $R_p$ dominates the equivalent resistance. Plotting the magnitude and phase of the cross-coupled oscillator gives the graphs of Fig. 3.15. Note that $|H_x|$ represents each half-circuit gain because both halves are the same. The phase shift makes sense because $90^\circ$ represents the inductor leading and $-90^\circ$ represents the capacitor lagging. The total loop gain of the circuit is the product of the two stage gains (as mentioned above), and when the magnitudes are multiplied, the phase angles add to $\pm 180^\circ$. This closed-loop system satisfies equations 3.2 because $|H_xH_x|$ (i.e. the magnitude of $g_{m1}R_p g_{m2}R_p$) is greater than 1 and a total phase shift of $360^\circ$ occurs at $\omega_0$.
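
The magnitude condition can be restated as a minimum transconductance. Assuming matched devices, $g_{m1} = g_{m2} = g_m$, the loop gain at resonance gives the start-up requirement below. The 250 Ω value of $R_p$ is a hypothetical number chosen only to show the scale of the result; it is not a measured or extracted value from this design.

\[
|H_x(j\omega_0)|^2 = (g_m R_p)^2 \geq 1
\quad\Longrightarrow\quad
g_m \geq \frac{1}{R_p},
\qquad \text{e.g. } R_p = 250\,\Omega \;\Rightarrow\; g_m \geq 4\,\text{mS}
\]

This is the same statement as requiring the magnitude of the negative resistance $-\frac{1}{g_m}$ to be no larger than $R_p$, so that the active pair replaces at least as much energy as the tank loses each cycle.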

Finally, the last part of the LC-VCO to determine is the natural frequency of oscillation. Thus far, we have only determined whether our circuit will produce oscillations. In order to derive the natural frequency, we must utilize the small-signal model of the circuit. Looking at Figure 3.14, we can see that the circuit is symmetrical. Therefore, it is appropriate to use the half-circuit small-signal equivalent model of the LC-VCO, which is illustrated in Figure 3.16. Applying KCL at $V_{out}$ and $V_L$ yields the nodal equations:

$$g_mV_{in} + \frac{V_{out}}{r_o} + sCV_{out} + \frac{V_{out} - V_L}{R_p} = 0 \quad (3.3a)$$

$$\frac{V_L - V_{out}}{R_p} + \frac{V_L}{sL} = 0 \quad (3.3b)$$

Solving for $V_{out}$ and dividing by $V_{in}$, we find the transfer function to be the following:

$$H(s) = \frac{V_{out}}{V_{in}} = -\frac{g_m r_o (R_p + sL)}{s^2 r_o CL + s(L + r_o CR_p) + r_o + R_p} \quad (3.4)$$
Figure 3.14: The LC-VCO schematic with non-ideal LC tank, where $R_p$ represents the parasitic resistance from the inductor and capacitor.

Figure 3.15: Magnitude and phase of the cross-coupled circuit gain for the individual stage (left figure) and open-loop for the whole circuit (right) [6].
From here, we solve the denominator for the natural frequency. The first step is to substitute \( j\omega \) for \( s \). Following this substitution, we are only concerned with the real part of the denominator and can ignore the imaginary part. In equation 3.4, \((j\omega)^2\) becomes \(-\omega^2\), and we may ignore the middle term of the denominator because it is purely imaginary. Additionally, in the extremely likely case that \( r_o \gg R_p \), \( r_o \) dominates the constant term \( r_o + R_p \). Given these conditions, we can now solve the denominator for the natural frequency:

\[
-\omega^2 r_o C L + r_o = 0 \tag{3.5}
\]

Solving for \( \omega \), we obtain

\[
\omega = \frac{1}{\sqrt{LC}} \tag{3.6}
\]

Thus, the oscillator configuration of Figure 3.14 has been analyzed effectively to meet our needs for a VCO in this thesis. The following section will show the applicable inductors for a standard 130 nm CMOS process.
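
Equation 3.6 can be turned around to size the tank capacitance for a chosen inductor. The Python sketch below does this for the two carrier frequencies of this work; the inductance values are hypothetical placeholders, since the actual inductances are not stated in this section, and the resulting capacitance would include device and wiring parasitics in a real design.

```python
import math

def tank_capacitance(f_hz, l_henry):
    """Capacitance that resonates with l_henry at f_hz (from omega = 1/sqrt(LC))."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_henry)

def resonant_frequency(l_henry, c_farad):
    """Natural frequency in Hz of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

# (carrier frequency in Hz, assumed inductance in H), illustrative values only
designs = ((10e9, 1.0e-9), (20e9, 0.5e-9))

for f_hz, l_henry in designs:
    c_farad = tank_capacitance(f_hz, l_henry)
    check_ghz = resonant_frequency(l_henry, c_farad) / 1e9
    print(f"{f_hz / 1e9:.0f} GHz with {l_henry * 1e9:.1f} nH: "
          f"C = {c_farad * 1e15:.0f} fF (check: {check_ghz:.2f} GHz)")
```

Because the frequency depends on the product $LC$, doubling the carrier frequency requires that product to shrink by a factor of four, which is one reason the 20 GHz inductor can be physically smaller.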

**Inductors**

Compared to discrete inductors, integrated inductors are more compact. However, relative to all other components in a CMOS process, the inductor is still one of the largest by area (though this varies with one's frequency specifications). The metric that describes an inductor's resonant ability is the quality factor, which is the ratio of inductive reactance to resistance, $Q = \omega L / R$. A high quality factor is desirable because it means little damping of the oscillation. On-chip inductors do not possess the high quality factors of off-chip inductors for the following reasons: the metal-layer trace resistance, $I^2R$ losses in the substrate coupled through the metal-to-substrate interface, and eddy currents in the substrate [19]. The good news is that, per this definition, the quality factor increases with frequency. For our purposes, a quality factor of 10 or more is sufficient to achieve a strong carrier for our VCO.
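
The quality factor target translates directly into a budget for series resistance. The Python sketch below uses the simple series model in which the quality factor is the angular frequency times inductance divided by resistance; the inductance value is a hypothetical placeholder. In a real on-chip inductor the effective resistance itself rises with frequency through the skin effect and substrate loss, so the linear growth with frequency shown here holds only over a limited range.

```python
import math

def quality_factor(f_hz, l_henry, r_series_ohm):
    """Q of an inductor modeled as L in series with a constant resistance."""
    return 2.0 * math.pi * f_hz * l_henry / r_series_ohm

def max_series_resistance(f_hz, l_henry, q_target=10.0):
    """Largest series resistance that still meets q_target at f_hz."""
    return 2.0 * math.pi * f_hz * l_henry / q_target

L_ASSUMED = 0.5e-9  # hypothetical 0.5 nH inductor

r_budget_10g = max_series_resistance(10e9, L_ASSUMED)
r_budget_20g = max_series_resistance(20e9, L_ASSUMED)
print(f"Q >= 10 at 10 GHz needs R <= {r_budget_10g:.2f} ohm")
print(f"Q >= 10 at 20 GHz needs R <= {r_budget_20g:.2f} ohm")
```

With a fixed resistance, doubling the frequency doubles the quality factor, which is the sense in which the definition favors the higher band.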

On-chip inductors do not have the liberty of being designed in the z-axis as much as discrete inductors. In fact, many processes limit inductors to only a few layers within the supported metal stack. In the IBM 130 nm process, inductor layers are M7 and M8, which accounts for only two of a total of eight within the metal stack. So, for on-chip inductor design, many opt for a spiral inductor or multi-layer ring inductor. The latter was chosen for our design.

Figure 3.17 shows the 10 GHz inductor and the 20 GHz inductor together. The quality factor was highest for both band designs when a two-layer (M8 and M7), two-turn form factor was used. It should be noted that the second layer (green M7) was only used to cross over to the inner turn. Radially, the 10 GHz inductor measures 300 $\mu$m and the 20 GHz inductor measures 150 $\mu$m. Interestingly, each layout looks as if it consists of only one inductor, but our design requires two inductors as shown in Figure 3.12. In reality, each layout contains two inductors. This is made possible by the center tap, which is realized by the middle connection. If one follows the trace from the center tap to the right, one arrives at the left port. Similarly, one arrives at the right port by going left from the center tap. Each path represents one inductor, and this design gives a very compact layout for two matched inductors. The quality factors of both inductors are plotted in Figure 3.18. At the target carrier frequencies of 10 GHz and 20 GHz, both inductors achieve a quality factor of $\geq 10$.

Figure 3.17: The layout of both 10 GHz (a) and 20 GHz (b) inductor for the VCOs that will produce our carrier signal for the RF transmitters. The 10 GHz inductor has a diameter of 150 µm and the 20 GHz inductor has a diameter of 150 µm.

Figure 3.18: The measured quality factor of 10 GHz inductor shown in (a) and 20 GHz inductor as seen in (b).

3.3.4 ASK modulator

Implementing ASK modulation in a CMOS configuration is relatively straightforward. We can visualize ASK modulation as a carrier signal that passes to the output when the data is high and is blocked when the data is low. Translating this visualization to a FET configuration, we can use the FETs as digital gates. Figure 3.19 shows this configuration. The data stream from the input buffer modulates the carrier by switching the current flow through M3 and M4 on and off to complete the ASK modulation [11]. The ASK-modulated signal is then coupled to the transmission line through the transformer, which will be explained in detail in the following chapter. Notice that the center tap is connected to VDD, which makes sense because we want the FETs connected through a metal line (the physical inductor material) to the top supply rail.

This design is very area-conscious, as there are only six transistors including the bias transistors (not pictured in Figure 3.19). A criticism of this design is that it requires a free-running VCO, which cannot be turned off to save power whenever the data is low. However, this is a better alternative than a complicated, power-hungry synchronized transmitter, which would require a free-running VCO anyway.

Following the FET configuration, one should test the operation of the ASK modulator and VCO together to confirm that the expected modulated carrier signal is truly the end result. To elaborate, the designer wants to ensure that the modulated output is at the carrier frequency and that spurious harmonics do not interfere with the output signal. One way to accomplish this is to view the output spectrum of both the VCO and the ASK modulator, which can be seen in Figure 3.20.
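As a rough illustration of this kind of spectral check, the following Python sketch builds an idealized on-off-keyed carrier and evaluates single DFT bins at the carrier and at its second harmonic. It is not the simulation flow used in this work: the bit pattern, the 1 Gb/s data rate, the 200 GS/s sampling rate and the function names are all assumptions chosen for illustration. In practice, the waveform would be exported from the circuit simulator rather than generated from an ideal sine.

```python
import math

def ask_waveform(bits, carrier_hz, bit_rate, sample_rate, amplitude=1.0):
    """Ideal on-off keying: the carrier passes for a 1 and is blocked for a 0."""
    samples_per_bit = int(round(sample_rate / bit_rate))
    wave = []
    for i, bit in enumerate(bits):
        for k in range(samples_per_bit):
            t = (i * samples_per_bit + k) / sample_rate
            wave.append(amplitude * bit * math.sin(2 * math.pi * carrier_hz * t))
    return wave

def tone_magnitude(wave, freq_hz, sample_rate):
    """Magnitude of one DFT bin at freq_hz, normalized by the record length."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(wave))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(wave))
    return math.hypot(re, im) / len(wave)

# Hypothetical test pattern and sampling choices (not taken from the thesis)
BITS = [1, 0, 1, 1, 0, 0, 1, 0]
CARRIER_HZ = 10e9
SAMPLE_RATE = 200e9

wave = ask_waveform(BITS, CARRIER_HZ, bit_rate=1e9, sample_rate=SAMPLE_RATE)
at_carrier = tone_magnitude(wave, CARRIER_HZ, SAMPLE_RATE)
at_second = tone_magnitude(wave, 2 * CARRIER_HZ, SAMPLE_RATE)
print(f"carrier bin: {at_carrier:.3f}, second-harmonic bin: {at_second:.3e}")
```

For an ideal gated sine the second-harmonic bin is essentially zero; a simulated waveform that shows significant energy there would indicate the spurious content the designer is trying to rule out.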

Figure 3.19: The ASK modulator FET configuration.
This section encompasses the final layout and post-extracted simulation of both RF transmitters. Beginning with the final layouts, we can view the 10 GHz transmitter in Figure 3.21 and the 20 GHz transmitter in Figure 3.22. A table listing each transmitter's final metrics is given below.

<table>
<thead>
<tr>
<th>Transmitter</th>
<th>Active-Area Dimensions</th>
<th>Power consumption</th>
</tr>
</thead>
<tbody>
<tr>
<td>Baseband</td>
<td>67 µm by 38 µm</td>
<td>5.41 mW</td>
</tr>
<tr>
<td>10GHz</td>
<td>45 µm by 134 µm</td>
<td>2.51 mW</td>
</tr>
<tr>
<td>20GHz</td>
<td>45 µm by 134 µm</td>
<td>2.50 mW</td>
</tr>
</tbody>
</table>

Table 3.1: Transmitter metrics. Note that the baseband layout is shown in Figure 3.8.
Figure 3.21: 10GHz Transmitter layout with final dimensions.

Figure 3.22: 20GHz Transmitter layout with final dimensions.
Chapter 4

Transmission Line and Transformer

4.1 Introduction

This chapter covers the on-chip transmission line (Tline) and the band-selective transformers. The first part characterizes the Tline for insertion loss. The second part is devoted to the line termination impedances for the three bands. The RF bands cannot be resistively terminated and instead require a transformer. The transformer is of particular interest because it acts as an impedance matching circuit that couples/decouples the RF bands to/from the Tline. The focus will mainly be on the design and characterization of the transformer.

4.2 Transmission Line

The Tline is the communication link between the transmitter and receiver. As previously discussed in Chapter 2, harmful effects such as ISI, attenuation due to dielectric losses and the skin effect modify the transmitted signal as it propagates down the Tline. In many high-frequency applications (i.e. tens of gigahertz), the Tline can be the limiting factor of the system. Therefore, it is prudent to analyze the insertion loss of the Tline. This means one must complete a frequency-domain analysis over the range of frequencies one plans to introduce into the Tline. With this information, one can see the amount of loss for a given frequency at the specified Tline length.

Since this is an on-chip Tline, one has control over a few parameters. I had the liberty of choosing the length, width, metal layer, signal protection and characteristic line impedance. With these options in mind, the best practice for a high-frequency Tline is a ground-signal-ground arrangement. This means that the main signal is placed in the center and shielded by outer ground/DC connections. This helps prevent external electromagnetic influences from corrupting the signal in transmission. Translating this configuration to a CMOS process is not problematic because our process design kit provides a metal stack. Figure 4.1 depicts the shielding of the differential Tlines. The top metal layer M8 acts as a ground shield, the second layer from the top, M7, is the signal layer plus adjacent ground shields, and finally M6 is the bottom ground shield. All shielding/grounding layers are connected through vias.

For this analysis, I first made use of the planar EM software called Momentum, whose specific purpose is characterizing passive circuits. A Momentum simulation incorporates the process design kit substrate and metal stack to extract the scattering parameters from a physical design. In our case, we want to extract the insertion loss (i.e. the S21 scattering parameter) of the Tline. For the simulation setup, one must take some considerations into account. First, the computational resources of the computer are finite. Therefore, it is not prudent to lay out a 50 µm by 3 mm Tline and expect a full electromagnetic characterization in a reasonable amount of time. Remember that the minimum feature size in this CMOS process is 130 nm, and this software takes into account the multiple-metal-layer configuration, the material properties (e.g. metal materials like Al and Cu, SiO2, passivation layer) and the overall dimensions. For an accurate simulation, which usually requires a small mesh size, one must be aware of these factors.

The best approach for extracting the insertion loss from the Tline is to characterize a small, repeatable section, because the Tline's physical layout is made of \( x \) identical repeated sections. In our case, since our Tline is 4 mm, I created a 100 \( \mu \)m subsection. I characterized it for its insertion loss, then combined 40 of these subsections to effectively model the 4 mm Tline. The insertion loss of the on-chip Tline can be viewed in Figure 4.2. This insertion loss is very slight compared to an off-chip line, which may experience attenuation of about \(-7\) dB at 20 GHz [10]. The final dimensions of the differential signaling Tline can be viewed in Table 4.1. One parameter not included in the table is the spacing between the signal line and the two adjacent grounding lines, which is 20 \( \mu \)m. Additionally, it should be noted that the metal layer thickness is not a modifiable parameter, as it is set by the process design kit.
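The subsection approach can be illustrated with a small Python sketch that cascades the ABCD matrix of one 100 µm section 40 times and converts the result to insertion loss in a 50 Ω system. This is only a sketch of the cascading idea: it replaces the Momentum-extracted S-parameters with an analytic RLGC line model, and the per-unit-length values below are hypothetical placeholders, not values extracted from this design.

```python
import cmath
import math

def line_abcd(length_m, freq_hz, r, l, g, c):
    """ABCD matrix of a uniform line section from per-unit-length RLGC values.

    Valid for freq_hz > 0 (the shunt admittance must be non-zero).
    """
    w = 2 * math.pi * freq_hz
    z = complex(r, w * l)        # series impedance per metre
    y = complex(g, w * c)        # shunt admittance per metre
    gamma = cmath.sqrt(z * y)    # propagation constant
    z0 = cmath.sqrt(z / y)       # characteristic impedance
    gl = gamma * length_m
    return ((cmath.cosh(gl), z0 * cmath.sinh(gl)),
            (cmath.sinh(gl) / z0, cmath.cosh(gl)))

def matmul(a, b):
    """2x2 matrix product."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0],
             a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0],
             a[1][0] * b[0][1] + a[1][1] * b[1][1]))

def cascade(section, count):
    """Chain `count` identical sections by repeated ABCD multiplication."""
    total = ((1, 0), (0, 1))
    for _ in range(count):
        total = matmul(total, section)
    return total

def s21_db(abcd, z_ref=50.0):
    """Insertion loss (S21 in dB) of a two-port between z_ref terminations."""
    (a, b), (c, d) = abcd
    s21 = 2 / (a + b / z_ref + c * z_ref + d)
    return 20 * math.log10(abs(s21))

# Hypothetical per-unit-length values (NOT extracted from the thesis layout)
R, L, G, C = 2.0e3, 4.0e-7, 0.0, 1.6e-10   # ohm/m, H/m, S/m, F/m
FREQ = 20e9

section = line_abcd(100e-6, FREQ, R, L, G, C)            # one 100 um subsection
loss_cascaded = s21_db(cascade(section, 40))             # 40 subsections = 4 mm
loss_direct = s21_db(line_abcd(4e-3, FREQ, R, L, G, C))  # single 4 mm line
print(f"cascaded: {loss_cascaded:.3f} dB, direct: {loss_direct:.3f} dB")
```

For a uniform line, the cascaded and direct results agree, which is the property that justifies simulating only one short section with the field solver.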

Another interesting result is the LC response around 20 GHz. In this region, the single-pole RC response does not dominate. This type of LC response of on-chip Tlines is well documented [3, 20–23]. One may use the Tline geometry to exploit the LC region to accommodate higher-frequency signals as in [3], which used wire widths of 16 \( \mu \)m. Conversely, [23] did not need to accommodate higher-frequency signals, so they were able to use 0.4 \( \mu \)m widths, which rendered the LC region unusable at \(-60\) dB at a frequency of 10 GHz. As for the effect on the two RF bands at 10 GHz and 20 GHz, we should expect both to be attenuated by approximately the same amount.

<table>
<thead>
<tr>
<th>Signal</th>
<th>Width</th>
<th>Metal Layer</th>
</tr>
</thead>
<tbody>
<tr>
<td>Top Ground</td>
<td>112.5 µm</td>
<td>M8</td>
</tr>
<tr>
<td>Signal</td>
<td>10 µm</td>
<td>M7</td>
</tr>
<tr>
<td>Adjacent Grounds</td>
<td>4.2 µm</td>
<td>M7</td>
</tr>
<tr>
<td>Bottom Ground</td>
<td>112.5 µm</td>
<td>M6</td>
</tr>
</tbody>
</table>

Table 4.1: Final geometrical transmission line parameters. The spacing between the M7 Signal and Adjacent Ground lines is 20 \( \mu \)m.
As for the baseband termination (specifically digital), one should consider the rise-time of the digital pulses. As stated in [24], one must terminate the transmission line when the round-trip propagation delay of a signal is larger than the signal rise-time, or reflections will occur. A rule of thumb suggests that signals with rise-times of 100 ps need to be terminated on Tlines longer than 3 mm [24]. This means that our baseband signal should be terminated, since the rise-times of our signals will likely be \( \leq 100 \text{ ps} \) and our Tline is 4 mm in length. Past RF interconnects have used a 50 \( \Omega \) termination successfully, so I have chosen 50 \( \Omega \) as the on-chip termination.

### 4.3 Transformer

A designer should be cognizant of transmission line effects when a signal's wavelength is comparable to or less than the length of the line. One should recall from an undergraduate electromagnetics course that these effects introduce phase delays at various places along the line that ultimately result in reflections that interfere with the transmitted signal. For our 20 GHz signal, the quarter wavelength in free space is 3.75 mm. However, this signal is not propagating in free space. Taking a conservative velocity factor of 0.66, the quarter wavelength of the 20 GHz signal becomes 2.48 mm, which is less than the length of the line between the receiver and transmitter. Furthermore, the skin effect can modify the Tline impedance for the AC components because the signal propagates on the outside of the conductor. Thus, one should strive to achieve good impedance matching. A good impedance matching network also serves as proof that the tri-band interconnect could later be extended to off-chip lengths, where transmission line effects would also affect baseband signaling and the 10 GHz band.
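The quarter-wavelength figures can be reproduced with a few lines of Python. This is a minimal sketch that uses the rounded value c = 3 × 10^8 m/s implied by the 3.75 mm figure; the function name and the comparison against the 4 mm on-chip line length are illustrative assumptions.

```python
C_ROUNDED = 3.0e8  # speed of light in m/s, rounded as the 3.75 mm figure implies

def quarter_wavelength_mm(freq_hz, velocity_factor=1.0):
    """Quarter wavelength in millimetres for a given velocity factor."""
    return velocity_factor * C_ROUNDED / freq_hz / 4 * 1e3

LINE_LENGTH_MM = 4.0  # on-chip Tline length

for freq in (10e9, 20e9):
    q = quarter_wavelength_mm(freq, velocity_factor=0.66)
    relation = "shorter" if q < LINE_LENGTH_MM else "longer"
    print(f"{freq / 1e9:.0f} GHz: quarter wavelength {q:.3f} mm "
          f"({relation} than the {LINE_LENGTH_MM:.0f} mm line)")
```

With the same velocity factor, the 10 GHz quarter wavelength comes out longer than the 4 mm on-chip line, which is consistent with transmission line effects at 10 GHz becoming a concern mainly at longer, off-chip lengths.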

The objective of each transformer is to effectively couple/decouple its respective RF signal and to isolate it from its counterpart RF band signal and from the baseband signal (DC blocking). Furthermore, its purpose is not just filtering, but also impedance matching to the transmission line and to the input impedance of the receiver. The analysis of an impedance matching network for a transformer-based receiver has been thoroughly described in [10]. In essence, the transformer must exploit its inherent resonance effect to match the input impedance of the receiver; for this design, the resonance should be at 10 GHz or 20 GHz. The analysis of the impedance matching transformer in [10] showed that the impedance seen at the secondary coil in parallel with the input impedance of the receiver matches well at resonance. In fact, the transformer properties can actually boost the RF signal from a 50 Ω transmission line. The same analysis applies to impedance matching between the transmitter and the channel.
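To give a feel for the resonance condition, the sketch below uses the ideal LC relation f = 1/(2π√(LC)) to estimate the capacitance that would resonate with a given coil inductance at each band. It is a simplification: the full matching analysis is the one in [10], the 0.5 nH inductance is a hypothetical placeholder rather than a value from this design, and coupling and loading effects are ignored.

```python
import math

def resonant_frequency(inductance_h, capacitance_f):
    """Ideal LC resonant frequency in hertz."""
    return 1.0 / (2 * math.pi * math.sqrt(inductance_h * capacitance_f))

def resonant_capacitance(inductance_h, freq_hz):
    """Capacitance that resonates with inductance_h at freq_hz (ideal LC)."""
    return 1.0 / ((2 * math.pi * freq_hz) ** 2 * inductance_h)

L_ASSUMED = 0.5e-9  # hypothetical coil inductance in henries (not from the thesis)

c_10g = resonant_capacitance(L_ASSUMED, 10e9)
c_20g = resonant_capacitance(L_ASSUMED, 20e9)
print(f"10 GHz: {c_10g * 1e15:.0f} fF, 20 GHz: {c_20g * 1e15:.0f} fF")
```

For a fixed inductance, doubling the resonant frequency requires one quarter of the capacitance, which is one reason the two bands end up with differently sized passives.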

The transformer is interfaced between the channel and receiver as seen in Figure 4.3. However, the physical layout of the transformer, shown for the receiver side only in Figure 4.4, is quite different from what the schematic view implies. Unsurprisingly, the transformer looks geometrically very similar to the inductor that was covered in a previous chapter. The diameter of each transformer (specifically the primary coil that uses the top metal layer) is exactly the same as the diameter of the corresponding VCO inductor to provide extremely good band matching. The secondary winding's geometry provided adequate inductance for line impedance matching and good coupling between metals M8 and M7.

The physical transformer layout of Figure 4.3 can be viewed in Figure 4.5 with its diameter dimensions. This layout is placed at both ends of the Tline to couple/decouple the RF signals. Each transformer is center-tapped to $\frac{1}{2}V_{dd}$, which I will refer to as $V_{term}$. This is the termination voltage at the end of the Tline, which sets the RF common-mode input voltage to the receiver. Additional transformer characterizations, such as coupling ratio and quality factor for both the primary and secondary coils of the 10 GHz coupler (i.e. 300 µm diameter), can be viewed in Figure 4.6. The same parameters for the 20 GHz filter can be viewed in Figure 4.7. Additional physical parameters used for both transformers are 5 µm spacing between windings and 10 µm widths.

Figure 4.3: Schematic view of transmitter and receiver side transformers.

Figure 4.4: Physical top view of a transformer interfaced between the channel and an RF receiver.

One can see that both coils within both transformers well exceed a quality factor of 10 at the frequency band they were designed to filter. Having a quality factor of 10 or greater at the desired frequency is a good rule of thumb to follow. For the 10 GHz transformer, I was able to achieve an excellent coupling ratio of 0.8 at 10 GHz, which means there is only 20% loss in the transformation process. The 20 GHz transformer has a coupling ratio of about 0.67, which is sufficient as long as the amplitude of the modulated signal is on the order of several hundred millivolts (≥ 400 mV).

Finally, the last issue to consider is baseband transmission and how it is interfaced with the transformers. The configuration, illustrated in Figure 4.8, makes use of the center tap on the secondary coil of the transformers to send the signal to the transmission line. The 20 GHz and 10 GHz signals are coupled onto the baseband signal and propagate down the Tline, where they are eventually coupled off at the receiver. The baseband signal is terminated right before entering the center tap of the transformer's secondary coil. Note that the transformers are band selective and, as seen from the quality factor, couple a negligible amount of the baseband signal into their primary coils.
Figure 4.6: 10 GHz transformer quality factor for primary and secondary coil (left). Coupling ratio for the 10 GHz transformer (right).

Figure 4.7: 20 GHz transformer quality factor for primary and secondary coil (left). Coupling ratio for the 20 GHz transformer (right).

Figure 4.8: Baseband connection to center-tapped secondary coil of the transformers. Note the secondary coil is physically connected to the transmission line.
As a research ambition, I originally set out to design an interconnect capable of three simultaneous data transmissions utilizing the three bands of operation (BB, 20 GHz and 10 GHz). As one may see from Figures 4.6 and 4.7, the 20 GHz band filter is quite capable of coupling a 10 GHz signal into its primary coil. This was ultimately the bane of this ambition. In particular, this design needs further refinement to create greater isolation between the RF bands. Therefore, only one band may be transmitted at a time in this project.
Chapter 5

Baseband and RF Receivers

5.1 Introduction

This chapter explains the baseband and RF receivers in detail. The chapter begins with the straightforward baseband receiver. The following sections are devoted to the RF receivers. The design process, layout and preliminary simulation results are shown for all of them. The final post-extracted transient system results are shown in Chapter 6.

5.2 Baseband Receiver

As shown in the previous chapter, the insertion loss for a baseband signal on the 4 mm transmission line was $-1.5$ dB at 4 GHz. When the single-ended output is $\geq 400 \text{ mV}$, this results in minimal loss. ISI is probably the only concern; however, this line is on-chip, so charging/discharging it at 4 GHz will not be a problem. Thus, there are no special recovery circuits in the baseband receiver. It simply consists of a set of differential amplifiers at the front end to bring the signal close to rail-to-rail. Following this is a set of amplifiers that pulls the signal rail-to-rail, which feeds an inverter chain that is capable of driving the output driver. Its physical layout can be seen below in Figure 5.1. The final simulation of this circuit is shown in the following chapter as proof of concept.
The ASK-modulated RF carrier (in both cases) is coupled off the on-chip transmission line through a transformer to filter it out from the baseband signal. The transformer acts as an impedance matching device for the RF signal, which is coupled into the secondary coil and fed into the mixer input. The RF receiver design features a mixer that is capable of non-coherent direct down-conversion to baseband from the carrier frequency. It is followed by a buffer converter with RC feedback for input common-mode equalization, which will be explained below.

The mixer was designed with low-power consumption as the first priority. Thus, we chose a non-coherent configuration because this demodulation method lacks power-hungry clocking circuits. The operation of the mixer follows the design principles of the Gilbert cell multiplier circuit [25], where in this instance the RF and LO inputs are the same carrier frequency. The intended consequence of this is the down converted baseband signal that originally modulated the carrier.

The mixer is fed with the ASK-modulated signal from the transformer into the four inputs, M1 through M4, which are traditionally considered the RF input in the Gilbert cell architecture. At the same time, there is a high-pass path (through the capacitor) to the drains of the two differential amplifiers contained within the mixer, and this signal is considered the LO input. The resulting output is the down-converted baseband signal that is fed into the pseudo-differential amplifier. One can think of the mixer's operation as that of an envelope detector, because recovering the digital baseband signal amounts to following the envelope of the ASK-modulated signal.

From a more generalized quantitative standpoint, the multiplicative process performed by the Gilbert multiplier begins with the two sinusoidal input signals that will be multiplied together. These two signals can be described simply as:

\[ s_1 = A \sin(\omega_1 t + \phi_1) \] (5.1)
\[ s_2 = B \sin(\omega_2 t + \phi_2) \] (5.2)

Using the product-to-sum trigonometric identity and rearranging factors, we obtain an output with two different frequency components:

\[ s_1 \times s_2 = \frac{AB}{2} \left( \cos((\omega_1 - \omega_2)t + (\phi_1 - \phi_2)) - \cos((\omega_1 + \omega_2)t + (\phi_1 + \phi_2)) \right) \] (5.3)
One can see from above that two signals are created; one signal is in the additive form and the other in the subtractive form. Furthermore, if we use two modulated signals that are the same, we could theoretically convert directly down to baseband. One can qualitatively confirm this with a relevant example involving ASK modulation. One can see the differential transient waveform of an ASK-modulated signal on the left in Figure 5.3. This would be the signal fed into the receiver inputs. For this example, the data rate is 1 Gb/s and the carrier signal is 20 GHz. The positive component of the differential signal gets multiplied at the mixer by its complement (i.e. the negative component). This product should produce the baseband signal at the mixer output. One can see the spectral output of this operation on the right in Figure 5.3, and it effectively shows the two frequency components, one of which is the baseband. The second component at 40 GHz will be filtered out by subsequent differential amplifiers that act as low-pass filters.
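The same example can be mimicked numerically. The Python sketch below generates an ideal ASK signal at the 1 Gb/s data rate and 20 GHz carrier of the example, multiplies it by itself, and uses a per-bit average as a crude low-pass filter to recover the data. It is an idealized illustration and not a model of the Gilbert cell: the bit pattern, the 400 GS/s sampling rate, the decision threshold and the function names are assumptions, and the sign inversion that results from multiplying by the complementary signal is ignored.

```python
import math

def ask_samples(bits, carrier_hz, bit_rate, sample_rate):
    """Ideal ASK waveform; returns the samples and the samples-per-bit count."""
    spb = int(round(sample_rate / bit_rate))
    samples = [bit * math.sin(2 * math.pi * carrier_hz * (i * spb + k) / sample_rate)
               for i, bit in enumerate(bits) for k in range(spb)]
    return samples, spb

def self_mix_and_detect(samples, spb, threshold=0.25):
    """Multiply the signal by itself, then average over each bit period.

    The per-bit average acts as a simple low-pass filter: it keeps the
    baseband term and cancels the component at twice the carrier frequency.
    """
    mixed = [s * s for s in samples]
    recovered = []
    for start in range(0, len(mixed), spb):
        average = sum(mixed[start:start + spb]) / spb
        recovered.append(1 if average > threshold else 0)
    return recovered

# Hypothetical bit pattern and sampling rate (not from the thesis)
TX_BITS = [1, 0, 0, 1, 1, 0, 1, 0]
samples, spb = ask_samples(TX_BITS, carrier_hz=20e9, bit_rate=1e9, sample_rate=400e9)
rx_bits = self_mix_and_detect(samples, spb)
print(rx_bits)
```

No local oscillator or clock recovery appears anywhere in the sketch, which reflects why the non-coherent approach is attractive from a power standpoint.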

Applying this concept in the context of our receiver, we see that both signals will be the same signal. In other words, both signals will be the ASK-modulated RF carrier signal. Therefore, the additive signal has a frequency component of double the RF carrier frequency, and the subtractive signal is the down-converted baseband signal, which in this case recovers the original transmitted binary data. The additive signal will be filtered out in subsequent circuit blocks, while the binary baseband data will be amplified rail-to-rail for off-chip transmission.
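The self-mixing case can be written out explicitly. As a sketch, assume an ideal ASK signal \( s(t) = A(t)\sin(\omega_c t + \phi) \), where the envelope \( A(t) \) takes the value \( A \) for a data 1 and 0 for a data 0 (the symbols \( A(t) \) and \( \omega_c \) are introduced here for illustration only). Multiplying the signal by itself and applying the identity \( \sin^2 x = \tfrac{1}{2}(1 - \cos 2x) \) gives:

\[ s(t)^2 = \frac{A(t)^2}{2} - \frac{A(t)^2}{2} \cos(2\omega_c t + 2\phi) \]

The first term follows the data envelope and is the recovered baseband signal. The second term sits at twice the carrier frequency (40 GHz for the 20 GHz band) and is what the subsequent low-pass stages remove. The real mixer contributes its own gain and sign, which this idealized expression ignores.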
The analysis and operation of the Gilbert multiplier is quite involved, and this circuit alone has been the focus of whole theses and journal articles [25–28]. It is quite well known, considering there is a section on this circuit in [29]. This work lets these references speak for themselves and does not repeat an exhaustive analysis. However, we can qualitatively see in Figure 5.2 that the current flowing into \( V_{O1} \) (\( V_{O2} \)) is made up of the current contributions \( I_1 \) and \( I_3 \) (\( I_2 \) and \( I_4 \)), where each \( I_x \) refers to the current that flows into transistor \( M_x \). The differential current output can be referred to as \( I_{out} = I_{O1} - I_{O2} \). As shown in [26], all currents can be related to yield a differential output current:

\[ I_{out} = V_d \left( \sqrt{K(2I_{T1} - KV_d^2)} - \sqrt{K(2I_{T2} - KV_d^2)} \right) \] (5.4)

where \( V_d = V_{in1} - V_{in2} \). Equation 5.4 thus expresses differential multiplication, which is exactly the function that we are looking for in this circuit. Finally, one can see that connecting \( R_1 \) and \( R_2 \) at the current outputs allows for a differential voltage output.

The output of the mixer is differential, but the complementary output signals have two different common-mode voltages due to the method of down-conversion used by the mixer. This presents a problem for subsequent differential stages, since each side will have a different common-mode. The remedy is a simple inverter with RC feedback, which controls the overall gain while settling on a single common-mode voltage. I refer to this configuration as the buffer converter. One can see its single-ended schematic in Figure 5.4 and the initial settling transient waveform in Figure 5.5. One observation from the transient waveform of Figure 5.5 is that the initial settling time is about 40 ns, which means that the first 40 ns worth of data is not recoverable. Thus, one would need an initialization period before any data of significance is sent.

The settling time is related to the values of the RC feedback network, which are physically represented in layout (Figure 5.6). A large resistance of 700 kΩ with a 30 fF capacitor was chosen simply because it gave the best transient response. The tolerances on both passives are loose and will not affect the overall operation after tape-out, even when their values are ±20% off the originally designed value.
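As a quick order-of-magnitude check, the product of the feedback resistance and capacitance can be computed along with its ±20% corners. This is a minimal sketch that treats the network as a plain first-order RC; the feedback around the inverter changes the effective time constant, so the numbers are not expected to match the simulated 40 ns settling exactly. The function and variable names are illustrative.

```python
def rc_time_constant(resistance_ohm, capacitance_f):
    """First-order RC time constant in seconds."""
    return resistance_ohm * capacitance_f

R_NOM = 700e3    # feedback resistance in ohms
C_NOM = 30e-15   # feedback capacitance in farads
TOL = 0.20       # +/-20% tolerance on both passives

tau_nom = rc_time_constant(R_NOM, C_NOM)
tau_min = rc_time_constant(R_NOM * (1 - TOL), C_NOM * (1 - TOL))
tau_max = rc_time_constant(R_NOM * (1 + TOL), C_NOM * (1 + TOL))

for label, tau in (("min", tau_min), ("nominal", tau_nom), ("max", tau_max)):
    print(f"{label}: {tau * 1e9:.2f} ns")
```

The nominal product is 21 ns, so the observed settling of about 40 ns corresponds to roughly two of these simple time constants.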

The mixer's layout for both bands may be viewed in Figure 5.7. The design is exactly the same for both, without any modifications for a particular band. Since this mixer's function is to demodulate, the best way to couple the RF signal to the drain for multiplication is a metal-insulator-metal (MIM) capacitor, which is represented in the green metal layer M7. MIM capacitors are regarded as the highest quality capacitors for RF and mixed-signal applications. A 100 fF MIM capacitor was used for our design. The full layout of the receiver is shown in Figure 5.8. Its dimensions are 172 µm by 235 µm. The other components following the buffer converter are multiple differential amplifiers, an inverter chain and a 50 Ω output driver. These are implemented to bring the recovered signal rail-to-rail for driving off-chip for analysis.

Figure 5.4: Schematic of common-mode voltage buffer.

Figure 5.5: The initial settling of a common-mode voltage from the buffer converter.

Figure 5.6: Layout of the buffer converter. Even for a polysilicon resistor, 700 kΩ results in quite a large size.
Figure 5.8: The receiver layout. The capacitor of the current mirror and the resistors of the buffer converter make up a lot of the height of the receiver.
Chapter 6

Final Results and Conclusion

Thus far, I have presented the individual components that make up this interconnect from transmitter through receiver. Here I will include the post-extracted simulation and the final thoughts about this research and future opportunities.

6.1 Final Simulations and Tests

For any RF design in a CMOS technology, one must accept that there are parasitics in every design and must account for their effects. One can lessen the shock of these effects by attaching 10 fF capacitors to the schematic nets to model the parasitic capacitances of the physical nets in layout. Furthermore, when a whole block (e.g. transmitter, receiver, etc.) is laid out, one should post-extract the RC parasitics and re-simulate with them. Often, the post-extracted simulation will reflect about 90% accuracy in its operation. If one designs RF or high-speed components in schematic view and tapes out the design without post-extraction, one can pretty much guarantee that the component will not work as anticipated. Therefore, we must lay out our design, simulate it post-extraction and refine it for better performance.

Figure 6.1 shows the post-extracted, final simulation of the RF bands of the reconfigurable tri-band interconnect (i.e. 10 GHz RF and 20 GHz RF). From top to bottom, I show the input data, the ASK-modulated signal and finally the recovered transmitted data. The left column consists of the 10 GHz RF band, while the right column shows the 20 GHz RF band. The data rate of the 10 GHz band is 2.5 Gb/s and the data rate of the 20 GHz band is 2.2 Gb/s. Unfortunately, these values were not as large as I would have liked them to be, but they do operate over 2.0 Gb/s. The post-extracted simulation data rates were largely limited by the aforementioned parasitic capacitances that exist in layout. Also, remember that these post-extracted simulation results for the RF bands hold only when one RF band is transmitting and the other is not.

The post-extracted simulation waveforms of the baseband signal can be viewed in Figure 6.2. I was able to achieve a data rate of 4 Gb/s for the baseband signal. Here, parasitic capacitances were not much of a problem. What really benefits the baseband is that the swing at the input of the receiver is large (as evidenced by the minimal loss) compared to the swing of either RF band at the input of the mixer. This makes recovery a lot easier, especially considering that no additional steps like down-conversion are needed.
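One way to put the three data rates on a common footing is to divide each transmitter's power from Table 3.1 by the data rate achieved in post-extracted simulation, giving a transmitter-only energy per bit. The sketch below does this arithmetic. It assumes that the Table 3.1 power figures apply at these data rates and it excludes the receivers entirely, so it should be read as a rough comparison rather than a link-level efficiency figure.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ: milliwatts divided by Gb/s gives pJ/bit."""
    return power_mw / data_rate_gbps

# (transmitter power in mW from Table 3.1, post-extracted data rate in Gb/s)
LINKS = {
    "baseband": (5.41, 4.0),
    "10 GHz": (2.51, 2.5),
    "20 GHz": (2.50, 2.2),
}

energy = {name: energy_per_bit_pj(power, rate)
          for name, (power, rate) in LINKS.items()}

for name, value in energy.items():
    print(f"{name}: {value:.2f} pJ/bit (transmitter only)")
```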

The final layout of the reconfigurable tri-band interconnect is shown in Figure 6.3. The view gives a global perspective of how everything is interconnected. The dimensions of this chip are 2.25 mm by 1.5 mm including the pads. Figure 6.4 shows the physical chip that we received back from the foundry, and Figure 6.5 shows the PCB test board.

6.2 Ending Remarks, Critiques and Future Plans

As for this research, I have successfully simulated a reconfigurable tri-band interconnect through post-extracted simulation (which requires a full top-level chip layout). In a future tape-out, I plan to submit this chip for testing. The potential of this chip cannot be overstated for its viability in future networks-on-chip.

Had this design worked as expected in the post-extracted simulation, it still would not be without its flaws. My original plan for this research was to transceive three data streams simultaneously and to exceed the data rate of [5]. As the reader now knows, this did not occur. The biggest setback was isolating the two RF bands from each other. I have since discovered more enhanced transformer designs than what was deployed in this system. I believe I could have made three simultaneous streams more feasible had I separated the RF bands further. For instance, instead of using a 10 GHz band and a 20 GHz band, maybe I could have used 10 GHz and 30 GHz. I believe this would have made filtering a lot easier. Another idea would be to utilize higher frequency bands, because their inductor and transformer designs have a smaller chip footprint.

Figure 6.1: (a) 10 GHz carrier: (top) input data at 2.5 Gb/s, (middle) modulated carrier and (bottom) recovered output data. (b) 20 GHz carrier: (top) input data at 2.2 Gb/s, (middle) modulated carrier and (bottom) recovered output data.

Figure 6.2: Baseband: (top) input data at 4 Gb/s and (bottom) recovered complementary output data.

As for future research plans, I would like to take up my suggestion and spectrally separate the RF bands further apart and use more enhanced filtering designs in the transformer. Additionally, the baseband signals seem largely unaffected by the transmission line given that there is good impedance matching. Therefore, I would like to implement multi-level signaling like PAM.

Research is a lot like the journey of life in that one does not always arrive at the originally anticipated destination. But overall, the knowledge I have gained from this project, and from writing this thesis to collect my thoughts, has been second to none. I have learned a lot about digital, analog and RF integrated circuits. I have characterized many active and passive devices. This has especially reinforced my undergraduate curriculum and has bridged many of the courses' topics together that seemed like isolated islands of information before. I really enjoy seeing their applications and finding how they are all inter-related within this project. This experience has been especially rewarding to me and has prepared me well for upcoming experiences in my Ph.D.
References


