Test code is code but it's different

Here's a bit of code...
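def to_roman(n)
  result = ''
  # Pairs of numeral and value, largest value first.
  # Only V, IV and I are covered, which is enough for 1 to 8.
  ['V',5,'IV',4,'I',1].each_slice(2) do |ch,unit|
    # Emit the numeral for as long as its value still fits into n.
    while n >= unit do
      result += ch
      n -= unit
    end
  end
  result
end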

And here's a bit of test code...
def test_arabic_integer_to_roman_string
  assert_equal    "I", to_roman(1)
  assert_equal   "II", to_roman(2)
  assert_equal  "III", to_roman(3)
  assert_equal   "IV", to_roman(4)
  assert_equal    "V", to_roman(5)
  assert_equal   "VI", to_roman(6)
  assert_equal  "VII", to_roman(7)
  assert_equal "VIII", to_roman(8)
end

I'm sometimes asked which is more important: the code or the tests? Does it matter if you refactor the code but let debt accumulate in the tests? A question like this tells me the questioner probably hasn't grasped that test code is the yin to the yang of the code it tests, and vice versa; that the two form a co-evolving system. When you feel the tests need refactoring, that isn't a sign you did something wrong. It's just part of development! The tests mean the code under test is not as closed a system as it would be without them. And being less closed, it can stave off entropy for longer.

That's not to say that code and tests are the same. They're both code, but they're not the same.

For example, suppose I have a metric that calculates a measure of the complexity of code. I use this metric to calculate the complexity of the code under test and also the complexity of its tests. Imagine the complexity of my tests is greater than the complexity of the code they test. How do I know the tests aren't wrong? Should I write some tests for the tests? And if the pattern repeats, and the complexity of the tests for the tests is higher still, should I write some tests for the tests for the tests? No. That way leads nowhere. I don't want complexity in my tests. I want them simple. I want them linear. The complexity of my tests might be greater than one, but it must be smaller than the complexity of what they're testing.
Test code is code but it's different.
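
To make that comparison concrete, here's a rough sketch. The metric is my own stand-in for illustration: crude_complexity is a hypothetical helper that counts one plus the number of branching or looping keywords in a piece of source text. It's a toy, not a real parser and not a proper cyclomatic complexity tool, but it's enough to show the shape of the check.

```ruby
# A toy complexity metric (an assumption for illustration, not a real tool):
# one plus the number of branching or looping keywords in the source text.
def crude_complexity(source)
  branching = /\b(?:if|elsif|unless|while|until|case|when|each|each_slice)\b/
  1 + source.scan(branching).size
end

# The code under test, held as plain text so the metric can read it.
def code_under_test
  <<~'RUBY'
    def to_roman(n)
      result = ''
      ['V',5,'IV',4,'I',1].each_slice(2) do |ch,unit|
        while n >= unit do
          result += ch
          n -= unit
        end
      end
      result
    end
  RUBY
end

# The test code, also held as plain text.
def test_code
  <<~'RUBY'
    def test_arabic_integer_to_roman_string
      assert_equal    "I", to_roman(1)
      assert_equal   "II", to_roman(2)
      assert_equal  "III", to_roman(3)
      assert_equal   "IV", to_roman(4)
      assert_equal    "V", to_roman(5)
      assert_equal   "VI", to_roman(6)
      assert_equal  "VII", to_roman(7)
      assert_equal "VIII", to_roman(8)
    end
  RUBY
end

puts crude_complexity(code_under_test)  # 3 (one each_slice, one while)
puts crude_complexity(test_code)        # 1 (no branching at all)
```

By this crude measure the test is as linear as it can be, and comfortably simpler than the code it tests, which is the relationship I'm after.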

Or consider the names of functions. In code under test I want Goldilocks names - names that are not too long, and not too short. On the one hand I want reasonably long names - long enough to express intention. On the other hand, names always occur as sub-expressions of larger expressions. If my names are too long, the full expressions they're part of quickly become unwieldy. It's in the full expressions I really want understandability, so I can turn the name-length-dial down a bit.
But the names of my test functions are different. They are never part of larger expressions. So for them I can turn the name-length-dial up up up - all the way to, deep breath, no_newline_at_end_of_file_msg_is_gobbled_when_at_end_of_common_section
Test code is code but it's different.
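
Here's a sketch of what turning the dial up might look like for the Roman numeral test from the top of this post. I'm assuming Minitest here (the original test would work with it, but that's my choice), and the class name and the long test names are made up for illustration. Notice that nothing ever calls these methods by name. The framework finds them by reflection, so the names never appear inside a larger expression.

```ruby
require 'minitest/autorun'

def to_roman(n)
  result = ''
  ['V',5,'IV',4,'I',1].each_slice(2) do |ch,unit|
    while n >= unit do
      result += ch
      n -= unit
    end
  end
  result
end

# Hypothetical class and test names, chosen only to show how long a
# test name can comfortably get.
class ToRomanTest < Minitest::Test
  def test_one_two_and_three_are_written_by_repeating_the_letter_I
    assert_equal   "I", to_roman(1)
    assert_equal  "II", to_roman(2)
    assert_equal "III", to_roman(3)
  end

  def test_four_is_written_as_one_before_five_rather_than_as_four_ones
    assert_equal "IV", to_roman(4)
  end

  def test_five_to_eight_are_written_as_V_followed_by_zero_to_three_ones
    assert_equal    "V", to_roman(5)
    assert_equal   "VI", to_roman(6)
    assert_equal  "VII", to_roman(7)
    assert_equal "VIII", to_roman(8)
  end
end
```

Names this long would be unbearable in the code under test. Here they cost nothing, and when one of them fails the name alone tells me most of what I need to know.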