Menu contact home

Chapter 2
Modelling SOI waveguides

The propagation of light through silicon waveguides involves the complex interplay between a variety of physical effects. In this chapter, a model for this propagation is developed.

Whilst Maxwell’s equations can be integrated directly [96], such an approach is time consuming and requires vast amounts of memory. A 3D array of volume elements must be stored in its entirety, with each being represented by a set of twelve numbers (the vector components of both the electric and magnetic fields, plus back-values) [96]. For a dispersive nonlinear system, further vectors representing the dielectric polarisation must be added [96]. The simulation must also use a very large number of time steps, as there must be sufficient resolution to accurately describe the form of each wavecycle, and yet up to 10,000 such cycles may be present over the full length of propagation.

The system can be greatly simplified by reformulating it into a Nonlinear Schrödinger Equation [8239798]. This is scalar and one dimensional, in that the waveguide is represented by an amplitude factor, which is a function of the distance along the waveguide. This reduction, however, does not amount to ignoring the vectorial nature of the fields, or their variation perpendicular to the direction of propagation. Instead, it uses the effective index approximation, in which the amplitude is a scaling factor for the vector mode.

The Nonlinear Schrödinger (NLS) equation has another great advantage in that instead of describing the electric field itself, it describes a slowly varying envelope which modulates a fixed frequency carrier signal. This allows the step size in any integration scheme to be greatly increased, as the rapid oscillation of the carrier wave has been removed. More importantly, the second derivative of this envelope function with respect to distance of propagation is negligible, allowing it to be removed. This slowly varying envelope approximation reduces what would be a second order equation (in distance) into a first order equation.

The equation itself is given by [8]

∂A      ∂2A
∂z-+ iβ2∂t2-= iγ|A |2A
(2.1)

where A is the amplitude of the envelope, z is distance along the waveguide, and t is time. The group velocity dispersion (GVD) of the waveguide is specified by the constant β2 (which is described in section 2.1.2) whilst the Kerr nonlinearity is specified by the constant γ (which is described in section 2.2.2).

In order accurately represent an SOI waveguide, the ideal NLS equation must be substantially modified. Firstly, the NLS equation assumes a simplified model of dispersion, in which the relation between frequency and wavenumber is fitted with a parabola (where β2 is the quadratic coefficient). This approximation is sometimes acceptable for optical fibre [8], but in SOI it becomes inadequate [5799100101]. Therefore, the NLS must be extended into a Generalised Nonlinear Schrödinger equation, which includes higher order derivatives. The other main deviation from the ideal NLS model is caused by damping. Both linear damping (due to light escaping from the waveguide) and nonlinear damping (due to two-photon effects) will be present. Furthermore, the light will excite free charge carriers, which themselves will absorb and scatter light.

In the following sections, the generalised NLS equation is both derived and extended to encompass the effects of damping and free charge carrier interactions.

2.1 Light in a waveguide

Maxwell’s equations in a dispersive and nonlinear optical material (for electric field ⃗E, displacement field  ⃗
D and magnetic field ⃗
H) are given by

              1  ∂⃗H
∇ × E⃗  =  – --2----                         (2.2)
             ε0c ∂t
∇ × H⃗  =  ∂-⃗D                               (2.3)
            ∂t
where ∇⋅⃗D = 0 and ∇⋅⃗H = 0. We have assumed that there are no free charge carriers or currents, a fact which is actually false for silicon; however, we will explicitly reintroduce the role of free charge carriers in section 2.3.2. We have also ignored the effect of magnetisation, which is justified, as silicon’s relative permeability differs from unity by less than 105. We can eliminate the magnetic part of these equations entirely to give a second-order wave equation of the form
  2⃗   -1--∂2⃗D-    (   ⃗ )
∇  E = ε0c2∂t2 + ∇  ∇ ⋅E
(2.4)

When considering wave propagation in a single direction , it is convenient to rewrite equation 2.4 as

∂2E⃗           1 ∂2⃗D     (     )
---2 + ∇2⊥⃗E =---2--2-+ ∇  ∇ ⋅E⃗
 ∂z          ε0c  ∂t
(2.5)

in which the Laplacian operator has been split up as 2 2/∂z2 + 2, where 2 is the transverse Laplacian. This is a 2-dimensional Laplacian operator which acts of the plane perpendicular to . The 2/∂z2 operator corresponds to wave propagation along the axis, whilst the 2 operator gives diffraction.

2.1.1 Material dispersion

The displacement field is related to the electric field by

 ⃗     ⃗  ⃗
D  = ε0E + P
(2.6)

where ⃗P is the polarisation induced by the electric field. This polarisation is not instantaneous, and at optical frequencies the delay becomes highly significant. (Neither is the relation truly linear, a phenomenon of fundamental importance that we will introduce in section 2.2.) The polarisation at each point in space can be modelled as a sum of driven harmonic oscillators, each corresponding to a particular excitation in the material, which at resonance will cause absorption. (This connection between absorption and dispersion is highly notable. The two are inextricably linked by the need for causality, as is mathematically formalised by the Kramers-Krönig relations [102].) The equation of motion of the nth oscillator (in terms of P⃗n, its contribution to the total polarisation) is given by

 2⃗    ∂2⃗Pn-        2⃗
ωnPn + ∂t2  = ε0Δ εnωnE
(2.7)

where ωn is the resonant frequency, and Δεn is the extent to which the oscillator couples to the electric field (and thus its contribution to the relative permittivity). The Δεn term is actually a second-order tensor (where deviation from isotropy is known as birefringence). However the birefringence of silicon corresponds to a refractive index difference of less than 105 [103], and so Δεn is to a good approximation a scalar. The lack of any damping term means that these equations are only physically valid when the optical spectrum does not overlap with the fundamental resonant frequency ωn. Summing over these polarisations gives a displacement field of the form

⃗D = ε0ε∞ ⃗E + ∑  ⃗Pn +ε0χ(3)
             n        0
(2.8)

where we have also added a quasi-instantaneous term, with polarisation ε0(ε∞ – 1) ⃗E, where ε is the instantaneous susceptibility. This is formally the susceptibility at infinite frequency, but in reality it corresponds to the low-frequency tail of a much faster resonance. In silicon, it has a value (in the infrared) of ε = 11.6858 [46], but in some formulations this is replaced by the usual oscillator term [104105]. Physically, this term results from the sum of several high energy direct-gap transitions [106].

In response to an electric field of the form acos(βz – ωt)ˆx, these oscillators reach a steady state (with Lorentzian frequency dependence) of the form

      ε0Δεnω2
⃗Pn = aω2-–-ωn2-cos(βz – ωt)ˆx
        n
(2.9)

which implies a frequency dependent dielectric permittivity

           ∑   Δ εnω2
ε(ω) = ε∞ +    ω2–-ωn2-
            n   n
(2.10)

which in terms of free space wavelength λ = 2πc/ω and refractive index n = √-
 ε gives

       √ ---------------
              ∑  -Δεnλ2-
n (ω) =  ε∞ +    λ2 – λ2n
               n
(2.11)

Equations of this type are known as Sellmeier equations. The physical coefficients are well documented for both silica and silicon [46107], and are given in figure 2.1. It is, however, more accurate to say that these coefficients are curve-fitting parameters to experimental data, rather than the precise physical parameters of the polarisation oscillators. In silicon, the Sellmeier equation is often tweaked to improve this fitting [46], giving a modified formula of

        √ ----------2--------2
nSi(ω) =  ε∞ + -Δε1λ1-+ Δ-ε2λ2
               λ21 – λ2    λ2
(2.12)

For silicon at the wavelengths of interest, Δε1 is the dominant cause of dispersion, and is due to silicon’s 1.1 eV indirect band-gap [75]. The Δε2 term provides a small correction, which is highly detuned from any resonance. In silica, the Δε1 and Δε2 terms correspond to absorptions in the ultraviolet, whilst the Δε3 term corresponds to an infrared absorption (as can be seen from the corresponding values of λn).


Silicon Silica



ε 11.6858 1
Δε1 0.00810461 0.6961663
Δε2 0.939816 0.4079426
Δε3 0.8974794
λ1(μm ) 1.1071 0.0684043
λ2(μm ) 1 0.1162414
λ3(μm ) 9.896161

Figure 2.1: Sellmeier coefficients for both silicon [46] and silica [107]. For silica for conventional Sellmeier formula (equation 2.11) is used, whilst for silicon a modified formula (equation 2.12) is used.

We can therefore write the (low power) dispersion relation as

D⃗= ε ε(ω)E⃗
     0
(2.13)

where the frequency dependent permittivity is given by equation 2.10. This approach of replacing the oscillators with a frequency dependent function is not, however, universally applicable. If we wish to directly solve Maxwell’s equations for a nonlinear system (rather than using the envelope approach described below), then the oscillators must be explicitly integrated [96].

Dispersion parameter

At this point we need to provide a rigorous definition of what we mean by group velocity dispersion. We start by noting that the phase velocity of light is given by vp = ω/k, where k is (by definition) the wavenumber. A pulse, however, will not travel at this speed, but at the group velocity vg = dω/dk. We can use the group velocity to define a group refractive index ng c/vg = n λ(dn/dλ). The gradient of group index is an obvious choice for specifying GVD, and so we define the dispersion parameter D as [8]

D (λ) ≡ 1 dng
       c dλ
(2.14)

This is constructed so that a pulse with spectral width Δλ (about a central frequency λ) will temporally broaden by ΔλD(λ) per unit propagation length. The GVD is conventionally measured in units of psnm1 km1, which is a consequence of its use in fibre optic communication, where the pulse duration is typically measured in picoseconds, the wavelength in nanometres and the distance in kilometers. The dispersion parameter can also be written in terms of (ordinary) refractive index n as

      λd2n
D ≡ – cdλ2
(2.15)

A positive value of D corresponds to anomalous GVD in which blue pulses move faster than red pulses, whilst a negative value corresponds to normal GVD, in which the converse is true.

Material dispersion of silicon and silica

Dispersion parameters calculated from the coefficients in figure 2.1 are shown in figure 2.2. In the range of interest (around 1.5μm), the GVD of silicon is strongly normal, whilst the much weaker GVD of silica is anomalous.



Figure 2.2: Dispersion parameters of silicon and silica, calculated from [46] and [107]. Positive values correspond to anomalous GVD, and negative values to normal GVD.

2.1.2 Waveguide dispersion

A waveguide introduces a dispersion of its own, due to the dielectric effects at material boundaries. This can be used to achieve strong anomalous GVD, despite the strongly normal material GVD of silicon [4748]. Assuming monochromatic light and a lossless waveguide mode, we can write the electric field ⃗E as

⃗   ⃗               (             )
E = Fω,ε(⃗x⊥,ω)(⃗x⊥) cosβ ω,ε(⃗x⊥,ω)z – ωt
(2.16)

where ⃗F is the modal profile (as a function of position on the transverse plane ⃗x), and β is its corresponding propagation constant. Both F⃗ and β depend on the frequency ω, and the cross-sectional dielectric profile of the waveguide ε(⃗x ⊥,ω). This profile varies with position over the transverse plane (depending on what material is present at that point in space) and is also dependent on frequency, as specified by equation 2.10 for the particular material. (This approach is not universal, as some sources recommend finding the modes in ⃗H rather than ⃗E for mathematical convenience [108].)

At given ω, values of ⃗F and corresponding β satisfying Maxwell’s equations can be found using readily available software packages. The FemSIM module of the RSoft package [109] was used. For the configurations used, the principal mode was a quasi-TE0 mode. (TE stands for transverse electric, meaning that the electric field has no component along the direction of propagation. By quasi-TE, we mean a mode in which the electric field component along the direction of propagation is small but non-zero. The subscript zero denotes the lack of any nodes within the mode.) A quasi-TM0 (transverse magnetic) mode also exists, but it is less confining, and is more difficult to excite. The two modes have different values of β, and so any coupling between them will not be phase matched, making the interaction between them negligible. The coupling is reduced further still by the fact that the modes are roughly orthogonal and so will have minimal projection on to one another. Due to its tighter confinement (and hence higher nonlinearity) we will concentrate on the quasi-TE0 mode throughout this report.

A variety of SOI waveguides are considered in this report, with dispersions tailored for a large range of applications. Their dispersions are given in figure 2.3. An example of the mode profiles used to calculate these dispersion relations is given in figure 2.4.



Figure 2.3: A selection of dispersion curves for silicon on insulator waveguides with rectangular cross-section. Dimensions are given in nanometres. Positive values correspond to anomalous GVD, whilst negative values correspond to normal GVD.



Figure 2.4: Principal quasi-TE0 mode of a 220nm × 380nm SOI waveguide. Shown over a 1μm × 1μm cross section. The electric field vector is split into Cartesian components, with transverse components parallel to the silica-air interface shown top, transverse perpendicular components shown middle, and longitudinal components shown bottom. The horizontal lines denote the silica-air interface, whilst the rectangles show the cross section of the silicon wire. It has been assumed that the wire lies 20nm above the silica-air plane due to over-etching, and that the wire is covered with a 100nm thick etching mask of refractive index 1.35. Colour saturation gives absolute value. The + and signs denote relative phase.

Extracting dispersion coefficients

Values of β at various ω were fitted to a polynomial

        β1          β2        2  β3        3
β = β0 + 1! (ω – ω0)+ 2! (ω – ω0) + 3! (ω – ω0) + ⋅⋅⋅
(2.17)

where βm are the dispersion coefficients. The core frequency ω0 is chosen to be the pump frequency. This is the principal frequency of the light fired into the waveguide. Through most of this report we use a pump wavelength of λ0 = 1.5 μm, giving a pump frequency of ω0/2π = 1256 THz.

An important issue with polynomial fitting is the order of polynomial used. Whilst a sufficiently high order is required to accurately fit the underlying function, the use of too high an order (in which the number of polynomial coefficients starts to approach the number of datapoints) will cause pathological behaviour in which the datapoints themselves are well fitted, but the curve of the polynomial lurches wildly between them. (In the extreme case of an Nth order polynomial being used to interpolate N + 1 datapoints, this is known as Runge’s phenomenon [110].) This problem was avoided by calculating a large number of data points (typically between 60 to 100). Using an 11th order polynomial gave fitting errors in the region of 1 part in 105, whilst maintaining a high ratio between the number of data points and the number of free parameters in the polynomial. Even so, it is possible that slight fitting pathologies have crept into the analysis. In figure 2.3, for example, the plots for the 220nm × 380nm and 220nm × 420nm wires show an unusually flat region between about 1.25μm and 1.35μm. This feature, however, is relatively slight, and is unlikely to affect further analysis.

The constant parameter β0 is an unimportant phase-factor. The first-order parameter β1 is the reciprocal of the group velocity. Again, this is of relatively little importance, as it merely describes the time taken for a pulse to leave a waveguide, rather than how it evolves whilst in the waveguide. Indeed, in the nonlinear Schrödinger formulation derived in section 2.2.2, β1 is explicitly removed, by switching to a moving frame of reference. (It should be noted, however, that these coefficients cannot be neglected for the coupled waveguides considered in section 4.1.)

The most important coefficient is β2, which gives the group velocity dispersion at the pump frequency. This is related to the pump-wavelength dispersion parameter by

D (λ0) = – 2πc2β2
          λ0
(2.18)

where the change in sign means that positive values will give normal GVD, and negative values will give anomalous GVD.

The coefficients β3 and above comprise what is known as the higher order dispersion (HOD). These provide corrections to the parabolic dispersion model that results from considering β2 only. In many systems, the impact of HOD is negligible, except at wavelengths very close to the zero-dispersion wavelengths [8]. For SOI however, HOD is very important [5799100101], and so it must be considered fully.

2.2 Optical nonlinearity

By optical nonlinearity, what we specifically mean is that the dielectric polarisation induced by an electric field is not directly proportional to that field. The general form of the polarisation (up to third order) is given by [97]

           ∫ ∞
Pj  =    ε0    ˆχ(1)(t– τ1)Ek (τ1)dτ1                                (2.19)
           ∫–∞  jk
             ∞  (2)
       + ε0 –∞ ˆχjkl(t – τ1,t– τ2)Ek(τ1)El(τ2)dτ1dτ2
           ∫ ∞
       + ε0    ˆχ(j3k)lm (t– τ1,t– τ2,t– τ3)Ek(τ1)El(τ2) Em (τ3)dτ1dτ2dτ3
            –∞
(which is written using Einstein notation, so that index duplication implies summation). The first of these terms corresponds to the linear dispersion considered above, the second term corresponds to quadratic nonlinearity (the Pockel’s effect), and the third term corresponds to the Kerr effect.

This equation is extremely unwieldy, as the second-order tensor ˆχ(1) contains 9 functions, the third-order tensor ˆχ(2) contains 27 functions and the fourth order tensor ˆχ(3) contains 81 functions. However, we assume silicon to be isotropic, which reduces equation 2.19 to

           ∫ ∞
Pj  =    ε0    ˆχ(01)(t– τ1)Ej(τ1)dτ1                               (2.20)
           ∫–∞
             ∞  (3)
       + ε0 –∞ ˆχ0 (t– τ1,t– τ2,t– τ3)Ek (τ1)Ek (τ2)Ej (τ3)dτ1dτ2dτ3
where ˆχ0(1) and ˆχ0(3) are scalars, and the inherently anisotropic quadratic term has vanished completely. Secondly, we note that the nonlinearity of silicon is ultrafast [58] in that ⃗P responds effectively instantaneously to ⃗E. Therefore, the cubic susceptibility becomes
ˆχ(03)= χ(03)δ(t– τ1)δ(t– τ2) δ(t– τ3)
(2.21)

where χ0(3) is a scalar constant which defines the strength of the nonlinearity. The χˆ0(1) term remains time-dependent, but this simply corresponds to linear dispersion considered in section 2.1.1.

Therefore, in bulk silicon, we have a displacement field of the form

                   (     )
D⃗= ε0ε(ω)E⃗+ ε0χ(30) E⃗⋅E⃗ E⃗
(2.22)

In silicon (and in the majority of materials), χ0(3) is positive, meaning that refractive index increases with intensity. This is referred to as a focusing nonlinearity, as it enables the self-focusing effects that give spatial solitons.

We can generalise this to a waveguide by making the effective index approximation. We assume that the nonlinear propagation through a waveguide is the same as plane wave propagation through a bulk material with the same dispersion relation. We therefore rewrite equation 2.22 as

                   (3)(     )
⃗D = ε0n2eff (ω)E⃗+ ε0χ0 E⃗⋅E⃗ E⃗
(2.23)

where the effective refractive index neff is defined in terms of the modal propagation constant β as neff(ω) = βc/ω.

2.2.1 Nonlinear continuous wave propagation

Before proceeding with a full dispersive treatment of nonlinear propagation, it is instructive to consider the propagation of continuous wave (CW) radiation through a nonlinear waveguide. We assume a sinusoidal wave of the form

         (      )
E⃗= a cos ˜βz – ωt ⃗F'
(2.24)

where a is an amplitude parameter, which we will shortly relate to the optical power. We have also replaced the linear propagation constant β with a modified quantity ˜β , which is to be determined. The field profile ⃗F' is normalised as

 ⃗'  ------⃗F------
F  = √ ∫-∫-⃗-2----
          |F|dxdy
(2.25)

which is done so that | ⃗
F'|2dxdy = 1, thus allowing a to uniquely control the optical power. Substituting this into equation 2.23 gives a displacement field of the form

⃗D   =  ε0n2eff (ω)a cos(kz – ωt) ⃗F'                        (2.26)
             (3)               (     )
       + ε03χ0--|a|2acos(kz – ωt) ⃗F ' ⋅F ⃗' ⃗F'
             4(3)
       + ε χ-0-|a|2acos(3kz – 3ωt)(⃗F '⋅F⃗') ⃗F'
          0 4
where we have used the identity cos3(𝜃) 34 cos(𝜃) + 14 cos(3𝜃) to split the nonlinear term into its frequency components. The cos(3kz – 3ωt) harmonic is not particularly important, as it isn’t phase matched to the driving field, and will thus integrate to zero over successive wavecycles. Therefore, we can approximate this as zero.

Substituting the expressions for ⃗E and ⃗D (without the third-order harmonic term) into equation 2.5 gives

         2(               (3)     (     )   )
˜β2a⃗F' = ω-- n2eff (ω)aF⃗' + 3χ0-|a|2a ⃗F' ⋅ ⃗F ' ⃗F'
        c2               4
(2.27)

where we have assumed E⃗ to have zero divergence, and have removed the diffractive 2F⃗' term. We can eliminate the field profile from our equations by taking the scalar product with ⃗
F' and integrating over the transverse plane. This gives the dispersion relation

        ┌ ---(-----------------)
        ││  ω2          3χ (30)
˜β(ω,a) = √ c2- n2eff (ω)+-4S---|a|2
                          eff
(2.28)

where the effective area Seff is given by

      (∫ ∫        )2
          |⃗F|2dxdy
Seff = -∫-∫-|⃗F-|4dxdy--
(2.29)

If the intensity was uniform within the waveguide, and zero outside, then this quantity would correspond to the waveguide’s cross-sectional area. In reality, this is not the case, and so Seff has a value which is similar but not identical to the real area.

We can rewrite equation 2.28 as

         ˜         √-----------(3)---
˜n(ω,a) = β(ω,a)c-=  n2  (ω )+ 3χ0-|a|2
            ω         eff      4Seff
(2.30)

where ñ is a modified refractive index. (As expected, this expression reduces to neff in the lower-power limit.)

Intensity and nonlinear refractive index

The Poynting vector ⃗S is given by

                     (    )
⃗S = E⃗× H⃗= cε0neff (ω) ⃗E ⋅ ⃗E ˆz
(2.31)

where is the direction of propagation. The total power vector can then be found by integrating over the transverse plane to give

∫ ∫                        (            )
    ⃗Sdxdy = cε0neff (ω)a2cos2 ˜β (ω, a)z – ωt ˆz
(2.32)

Taking the mean of the component over time gives the power P as

P = cε0neff (ω)|a|2
        2
(2.33)

It is convenient to treat the amplitude as being the square root of the optical power. Therefore, we define a new measure of amplitude

       ------
     √ cε0n0
A ≡ a  --2--
(2.34)

which gives P = |A|2 as required. In order to remove the explicit frequency dependence, we have replaced neff(ω) with a typical value n0 which is representative of the frequency range of interest. Rewriting equation 2.30 using A gives

         √ -------------(3)------
˜n (ω,A ) =  n2 (ω)+  --3χ0----|A|2
            eff      2cε0n0Seff
(2.35)

The nonlinearity of silicon and silica are more conventionally described using their nonlinear refractive indices, rather than their χ0(3) values. The coefficient n2 is defined as that fitting the approximation

˜n(ω,I) = n(ω)+ In
                 2
(2.36)

where the intensity I is given by P/Seff. For silicon, this constant has a value between about 1018 m2 W1 and 1017 m2 W1 [5556], whilst for silica it is much smaller at about 3 × 1020 m2 W1 [111]. We can relate χ0(3) with n2 by assuming a weak nonlinearity and thus approximating equation 2.35 with a first-order binomial expansion

                     (3)    2
˜n (ω,A) = neff (ω )+-3χ0-2|A|-
                  4ε0c n0Seff
(2.37)

where in accordance with the effective index approximation, we have replaced the bulk index n with the effective index. This conforms to equation 2.36, with n2 given by

    -3χ(03)
n2 = 4ε0cn20
(2.38)

Using this conversion, we can rewrite equation 2.35 as

           ------------------
         √  2      2n0n2
˜n(ω,A ) =  neff (ω)+-Seff- |A |2
(2.39)

2.2.2 Derivation of the Nonlinear Schrödinger Equation

We are now ready to derive a dispersive and nonlinear model of waveguide propagation. We proceed by making A a function of time and position.

E⃗= F⃗'(⃗x)A(z,t)eiβ0z–iω0t
(2.40)

where the e0z0t factor represents a carrier wave with frequency ω0 and wavenumber β0. The variable A is now a slowly varying envelope, which controls amplitude (via its absolute value), phase (via its complex argument) and spectral composition relative to the carrier wave (via temporal variation of its complex argument). We will see shortly that this envelope formulation allows us to reduce the second-order Maxwell equations into a first-order equation (in z). In the frequency domain, the field is given by

⃗˜E = ⃗F' ˜A(z,ω – ω0 )eiβ0z
(2.41)

In section 2.2.1 we showed how an electric field with frequency ω propagates with a wavenumber  ˜
β . This implies a wave-equation of the form

       2  2
∇2⃗E = β˜-∂-⃗E-
      ω2 ∂t2
(2.42)

Taking the Fourier transform of equation 2.42 gives the Helmholtz equation

∇2 ˜⃗E = – ˜β2E˜⃗
(2.43)

We wish to remove diffraction, and so we remove the transverse Laplacian from equation 2.43 to give

∂2E˜⃗    ˜2 ˜⃗
 ∂z2 = –β E
(2.44)

Substituting equation 2.41 into equation 2.44 gives

∂2A˜      ∂A˜   2 ˜    ˜2 ˜
∂z2 + 2iβ0∂z – β0A = – β A
(2.45)

It follows from equation 2.39 that the wavenumber is given by

         2β0ω0n2
˜β2 = β2 +--S--c-|A|2
            eff
(2.46)

where the representative refractive index n0 has been specified to be that at the carrier frequency, so that n0 = β0c/ω0. Substituting this into equation 2.45 and rearranging gives

           2    (  2   2 )
∂A˜+ --1-∂-A˜+ i  β0 –-β- A˜= iω0n2|A|2 ˜A
∂z   2iβ0 ∂z2       2β0        Seffc
(2.47)

This can be rewritten as

                (       )
∂ ˜A-+-1--∂2A˜+ i  β20 –-β2 A˜= iγ|A |2A˜
∂z   2iβ0∂z2       2β0
(2.48)

where we have replaced the constants in the nonlinear term with a single coefficient given by

γ ≡ n2ω0/Seffc
(2.49)

Slowly varying envelope approximation

If we assume that the light is spectrally narrow (i.e. |ω ω0|≪ ω0) then it follows that the same applies to wavenumber (i.e. |β β0|≪ β0). This justifies the approximation β02 β2 2β0(β0 β), giving

  ˜        2 ˜
∂A-+ --1-∂-A- +i(β0 – β ) ˜A = iγ|A|2 ˜A
∂z   2iβ0 ∂z2
(2.50)

We are now readily to make our most important approximation: We drop the 2Ã/∂z2 term. The justification for this can be seen by taking the spatial Fourier transform of the derivative terms

-1--∂2 ˜A     (β-–-β0)2
2iβ0∂z2  ⇒      2iβ0  A                        (2.51)
      ˜
     ∂A- ⇒   – i(β – β0)A                      (2.52)
     ∂z
The coefficient in front of the second derivative is extremely small (being a small value squared), and so the whole term can be removed to give
  ˜
∂A-+ i(β0 – β) ˜A = iγ|A|2 ˜A
∂z
(2.53)

This simplification is known as the slowly varying envelope approximation. The envelope A is spectrally narrow (both in terms of frequency and wavenumber), and hence it is broad both temporally and spatially. Therefore, the approximation amounts to requiring that the envelope must vary over a time scale that is much longer than the time period, and over a length scale that is much longer than the wavelength.

By using this approximation, we have reduced a second-order equation (in distance) to a first-order equation, which is extremely useful for finding both analytical and numerical solutions.

We can then replace β with the Taylor expansion denoted by equation 2.17, and so specify the waveguide dispersion by using the coefficients described in section 2.1.2. This gives

      (                             )
∂A˜– i (ω – ω0)β1 + 1 (ω– ω0)2β2 + ⋅⋅⋅ ˜A = iγ|A |2 ˜A
∂z                 2
(2.54)

Finally, we switch back to the time domain (noting both that à is a function of ω ω0 and not just ω, and that |A|2 is a function of t and not ω), to give the Generalised Nonlinear Schrödinger Equation

      ∑M        m
∂A-– i    im βm-∂-mA = iγ|A|2A
∂z    m=2   m! ∂t
(2.55)

where M is the total number of βm coefficients used to fit the dispersion relation. We have removed the β1 term (the reciprocal of the group velocity) which amounts to switching to a moving frame of reference. The reason for this is simply because the group velocity is largely irrelevant. We are interested in how light interacts with light, and the velocity at which these interactions takes place is a needless distraction from the fundamental physics of the situation. (Additionally, the removal of group velocity can be useful when finding numerical solutions, as the entire simulation can be confined to a relatively small time window.)

In the simplest possible dispersive case, we can truncate the expansion of β to second order. This gives the Nonlinear Schrödinger Equation as

∂A      ∂2A
∂z-+ iβ2∂t2-= iγ|A |2A
(2.56)

2.2.3 Generalised nonlinearity

Equation 2.55 is in fact only one example of a Generalised Nonlinear Schrödinger Equation. We can in principle replace the cubic term with any nonlinear function of A.

If we consider a more general nonlinear polarisation ⃗PNL, such that the displacement field is given by

D⃗≡ ε ε(ω) ⃗E + P⃗
     0          NL
(2.57)

we can derive an equation of the form

∂A     M∑   mβm ∂mA       ω0
∂z-– i    i m!--∂tm- = i2ε0n0c-ΛNL
      m=2
(2.58)

where ⃗PNL has been represented by a slowly varying envelope ΛNL(z,t), such that

⃗     1 ⃗'(    iβ0z–iω0t    ∗  –iβ0z+iω0t)
PNL = 2F   ΛNLe        +Λ NLe
(2.59)

If there are multiple sources of nonlinearity, then we can generalise this to

          N
⃗     1 ⃗'∑  (     iβ0nz–iω0nt   ∗   – iβ0nz+iω0nt)
PNL = 2F  n=1 ΛNLne         + ΛNLne
(2.60)

where ΛNLn is the envelope for a nonlinear term with carrier frequency ω0n and wavenumber β0n. We will return to this formulation in chapter 6, where the evolution of solitons sustained by the Raman effect (rather than Kerr nonlinearity) is considered.

2.2.4 Elementary solutions and dimensionless units

As we are interested in finding solitons, it is convenient to switch to a system of dimensionless units that reflects this fact. For negative β2 (i.e. anomalous GVD) a soliton solution to equation 2.56 of the form

           √ ----   (      )     (  )
         1-  |β2|     |β2|        -t
A (z,t) = T0   γ  exp  i2T20 z sech  T0
(2.61)

exists, where T0 defines its temporal width. We will consider this solution in detail in section 3.1, but at present will merely use it as a basis for intrinsic units. The power profile of this solution is given by

                   (   )
      2   |β2|    2  t-
P = |A | = T20γ sech    T0
(2.62)

and so the peak power is given by

     |β2|-
P0 = T20γ
(2.63)

This is the soliton threshold power, above which solitons can (for a given pulse duration) form. It can be written as

P =  -1--
 0   LDγ
(2.64)

where we have defined the dispersion length to be LD T02/|β2|. Roughly speaking, this is the "typical" length scale over which dispersion becomes significant. A more specific definition is that a Gaussian pulse (of the form eτ2/2T 02 ) will increase in width by a factor of √ -
  2 after one dispersion length [8].

Dimensionless units

The NLS can be converted into dimensionless form by setting T0, P0 and LD to unity, which reduces the soliton solution to an unscaled sech function (multiplied by an appropriate phase factor). The conversion is achieved by making the substitutions

   t =   T0τ                             (2.65)

  z  =   LDζ                             (2.66)
|A |2  =   P0|E |2                           (2.67)
This gives
∂E    M∑   m–1   ∂mE       2
∂-ζ +    i   pm ∂τm-= i|E| E
     m=2
(2.68)

where higher order dispersion has been reintroduced, with the corresponding coefficients being

                 2–m
pm = βmLDm--= βmT-0---
     m!T0     m!|β2|
(2.69)

It follows from its definition that p2 = ±1
2, with the sign depending on whether the GVD is normal or anomalous.

These intrinsic units depend on the system, but it is instructive to consider typical values. The value of T0 is defined by the temporal power profile of the input pulse, which we assume to be from a mode-locked laser. These devices are widely used as sources of ultrashort pulses, including experiments with silicon (e.g. [577672]). Notably, their pulse shape is the same as the above soliton, having a power profile of the form P sech2 (t/T0) [112113]. It is conventional to define pulse width using full width at half maximum (FWHM), which is related to T0 by the relation TFWHM 2ln(   √ -)
 1+   2T0. Throughout most of this report we assume a 100fs FWHM, giving a value of T0 56.730fs.

The value of LD is dependent on the waveguide geometry and the operational wavelength. For the systems considered, it ranges from about 0.5mm to 2mm. It should be noted that these values are exceptionally small; in fibre optics, values of metres or even kilometres are more typical.

The value of P0 is usually of the order a few watts, which again is exceptionally low. It can in principal be derived from other physical parameters using equation 2.49, but this is problematic due to the accuracy with which these coefficients are known. In section 3.3.2 we analyse this problem in more detail, and attempt to derive a value from (third party) experimental data. This ambiguity in P0 is not particularly important from a mathematical point of view, as it merely corresponds to a scaling of the necessary input power.

Dispersion Operator

It is convenient to combine all the time derivative terms into a single operator. This enables us to write the NLS equation in the more compact form of

∂E-   ˆ       2
∂ζ – iDE  = i|E |E
(2.70)

where the operator ˆD is defined as

    ∑M    (  ∂ )m
ˆD ≡     pm  i∂τ-
    m=2
(2.71)

2.3 Optical loss

Optical loss can be accounted for by noting that—mathematically speaking—attenuation is the same as wave propagation, but with an imaginary propagation constant. For linear attenuation, we replace the linear propagation constant with a complex number of the form

        i
β → β + 2α
(2.72)

where α is the attenuation coefficient. (The factor of 1
2 results from the fact that we have specified the absorption in terms of amplitude, whereas the attenuation coefficient is conventionally defined in terms of power.)

This loss primarily results from leakage to the waveguides having rough edges, and from optical coupling to the substrate on which the waveguide is fabricated [114]. Material absorption has been tentatively suggested as an additional loss mechanism [115]. Absorption typically ranges from about 2 dBcm1[116] to 5 dBcm1 [63], but values as low as 1dBcm1 have been achieved [117118].

Making this replacement gives

∂E
∂ζ-– iˆDE  = i|E|2E – εlE
(2.73)

where we have represented αl in dimensionless units as εl LDαl/2.

2.3.1 Multi photon absorption

Another important effect is two photon absorption. This occurs when two or more photons combine to overcome silicon’s 1.1eV indirect bandgap, and generate free charge carriers [119]. (For silica, with its very large 9eV bangap [120], this effect is insignificant.) Just as linear dispersion and absorption are interlinked, so are their nonlinear counterparts. The optical nonlinearity is an inevitable consequence (again mathematically formalised by a Kramers Krönig transformation) of these nonlinear absorptions [121122].

Two photon absorption can be represented by changing the nonlinear refractive index as

n →  n + icα2pa-
 2    2   2ω0
(2.74)

where α2pa is the two-photon-absorption coefficient. This has a value between about 0.3 × 1011 mW1 and 1.1 × 1011 mW1 [5556].

Making this substitution yields

∂E-– iˆDE = i|E |2E – εlE – ε2pa|E |2E
∂ζ
(2.75)

where the dimensionless 2PA coefficient is given by ε2pa (α2paP0LD)/(2Seff). This coefficient is the ratio between the real and imaginary parts of the refractive index (such that the complex index is given by (1+ ε2pai)n2). It is this ratio (rather than the absolute values) which is of particular importance for soliton dynamics [123], and so it is useful to specify ε2pa directly, rather than deriving it from α2pa. In section 3.3.2 this coefficient is determined from (third-party) experimental data [54], yielding a value of ε2pa = 0.1. This is consistent with other estimates, which range from 0.1 [124] to 0.14 [64].

Three photon absorption

In section 3.3.2 we propose the existence of three photon absorption. This is known to exist at much longer wavelengths where the photon energy is less than half the 1.1 eV bandgap making 2PA impossible [117], but we propose that it is also present when 2PA is possible. This can be modelled by adding a quintic absorption term to 2.75, giving

∂E- – iDˆE  = i|E |2E – εlE – ε2pa|E|2E – ε3pa|E |4E
 ∂ζ
(2.76)

where ε3pa gives the magnitude of the absorption. In section 3.3.2 we estimate it to have a value of 0.05.

2.3.2 Free charge carrier interactions

Another important effect is the impact of the free charge carriers (FCCs) excited by the multiphoton absorption. These can both absorb photons [74], and change the refractive index of the material [125].

The absorption is proportional to the number of free carriers, and so can represented by extending the modification to β given by transformation 2.72 to

β → β +-iα+ -iσN
       2    2
(2.77)

where N is the volumetric density of free carriers, and σ is the cross section of the absorption. [78].

The change in refractive index Δn caused by an increase in carrier density ΔN can be expressed as

Δn = – NfckΔN
(2.78)

where the constant Nfc is defined by this relation, and has a value of approximately 1.35 × 1027 m3 [69126127]. In this expression, the free space wavenumber is given by k. If the FCC density starts from a low value, this can be approximated as Δn = NfcN.

We can directly subtract this from β, but it is more conventional [124] to instead add an imaginary component to the cross section, such that

σ  = σ  + iσ
 fc    abs    ind
(2.79)

where the real component σabs is the absorption, and the imaginary component σind = 2Nfck represents the index change. In the wavelength region of interest, σabs has a value of approximately 1.45 × 1021 m2, whilst σind is approximately 1.09 × 1020 m2, giving a combined value of σfc = 1.45 × 1021(1+ 7.5i) m2 [124]. Making the replacement gives

∂E- – ˆDE = i|E |2E – εlE – ε2pa|E |2E – ε3pa|E|4E – εfcE ν
 ∂ζ
(2.80)

where we have converted σfc into dimensionless units as εfc LDσfcN0/2. The scaled carrier density ν, and the conversion factor N0 are described below.

Evolution of free charge carriers

To fully account for free charge carrier effects, the NLS must be coupled to another equation describing the carrier evolution. The volumetric density of charge carriers is given by [78]

∂N-   -α2pa--  4   1-
∂t  = 2ℏω0S2eff|A| – tcN
(2.81)

where is the reduced Planck’s constant. The first term corresponds to the excitation of charge carriers by two-photon-absorption, whilst the second term corresponds to carrier decay, with a mean lifetime of tc. This can range from 10ns to 200ns [128], and is dependent on the waveguide geometry [129]. The exact value of the lifetime is not particularly important for the purpose of this report, as it is both much larger than the pulse durations used, and much smaller than the pulse repetition time (a value of 4μs is used in section 3.3). Therefore we can assume decay to be negligible within pulses, but total between pulses, removing the need for a precise figure.

The equation for free-charge-carrier evolution can be converted into the same system of units as that used for the NLS by making the further substitution N νN0, where N0 is a typical free-charge-carrier density, which is given by

N0 ≡ α2paT0P02
      2ℏω0S2eff
(2.82)

This gives

∂ν-= |E |4 – ν-
∂τ         τc
(2.83)

where the recombination time has been scaled as τc tc/T0. At a given point in distance, this equation is readily converted to integral form, giving

         ∫ τ
ν = e–τ/τc     |E|4eτ'/τcdτ' + ντ=0e–τ/τc
          τ'=0
(2.84)

where the integration constant ντ=0 gives the carrier density at τ = 0.

2.4 Summary of the model

The propagation of light through a silicon on insulator photonic wire is described by

|---------------------------------------------------|
|∂E-   ˆ        2              2          4         |
|∂ζ – iDE  = i|E| E – εlE – ε2pa|E |E – ε3pa|E| E – εfcEν|
----------------------------------------------------
(2.85)

where the dispersion operator ˆD is defined as

    ∑M    (  ∂ )m
ˆD ≡     pm  i---
    m=2      ∂τ
(2.86)

and where the evolution of the free charge carriers is described by

∂ν-= |E |4 – ν-
∂τ         τc
(2.87)

Using the model

We now have a model capable of describing the evolution of SOI waveguides. The next chapter is devoted to finding solutions of equation 2.85, and to understanding the physical phenomena that they represent.