
This blog post addresses a long-standing FIXME in the conduit-combinators documentation, as well as a question on Twitter. It assumes familiarity with the Conduit streaming data library; if you'd like to read up on it first, please check out the tutorial. The full executable snippet is at the end of the post, but we'll build up intermediate bits along the way. First, the Stack script header, import statement, and some minor helper functions.

```
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Conduit

src10 :: Monad m => ConduitM i Int m ()
src10 = yieldMany [1..10]

remaining :: MonadIO m => ConduitM i o m ()
remaining = lengthC >>= \x -> liftIO (putStrLn ("Remaining: " ++ show x))
```

`src10` just provides the numbers 1 through 10 as a source, and `remaining` tells you how many values are remaining from upstream. Cool.

Now let's pretend that the Conduit libraries completely forgot to provide a `drop` function. That is, a function that will take an `Int` and discard that many values from the upstream. We could write one ourselves pretty easily:

```
dropSink :: Monad m => Int -> ConduitM i o m ()
dropSink cnt
    | cnt <= 0 = return ()
    | otherwise = await >> dropSink (cnt - 1)
```

(Bonus points to readers: this function is inefficient in the case that upstream has fewer than `cnt` values; optimize it.)
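For readers who want to check their answer, here is one possible take on that exercise, offered as a minimal sketch rather than the definitive solution. The name `dropSinkShort` and the helper `leftAfter` are made up for this illustration. The idea is to inspect the result of `await` and stop as soon as upstream reports that it is exhausted, instead of looping `cnt` times regardless.

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Conduit

-- Hypothetical variant of dropSink (not part of any library):
-- stops early once upstream has no more values to give.
dropSinkShort :: Monad m => Int -> ConduitM i o m ()
dropSinkShort cnt
    | cnt <= 0 = return ()
    | otherwise =
        -- Nothing means upstream is done, so there is nothing left to drop
        await >>= maybe (return ()) (const (dropSinkShort (cnt - 1)))

-- Illustration helper: how many of 3 values remain after dropping n
leftAfter :: Int -> Int
leftAfter n = runConduitPure
    $ yieldMany [1, 2, 3 :: Int]
   .| (dropSinkShort n >> lengthC)

main :: IO ()
main = do
    print (leftAfter 2)        -- prints 1
    print (leftAfter 1000000)  -- prints 0, without a million pointless awaits
```

The observable result is the same as with `dropSink`; the only difference is how much work is done when the requested count exceeds what upstream can supply.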

This function will drop a certain number of elements from upstream, so the next component we monadically bind with can pick it up. Let's see how that looks:

```
goodDropSink :: IO ()
goodDropSink = runConduit
    $ src10
   .| (dropSink 5 >> remaining)
```

All well and good. But notice two things:

- I called this `dropSink`. Why sink?
- I stressed that we had to monadically bind. Why?

Well, there's another formulation of this drop function. Instead of letting the next monadically bound component pick up remaining values, we could *pass the remaining values downstream*. Fortunately it's really easy to implement this function in terms of `dropSink`:

```
dropTrans :: Monad m => Int -> ConduitM i i m ()
dropTrans cnt = dropSink cnt >> mapC id
```

(For more meaningless bonus points, feel free to implement this without `dropSink`, or for a greater challenge, implement `dropSink` in terms of `dropTrans`.) Anyway, this function can be used easily as:

```
goodDropTrans :: IO ()
goodDropTrans = runConduit
    $ src10
   .| dropTrans 5
   .| remaining
```

Many may argue that this is more natural. To some extent, it mirrors the behavior of `take` more closely, as `take` passes the initial values downstream. On the other hand, `dropTrans` cannot *guarantee* that the values will be removed from the stream; if instead of `dropTrans 5 .| remaining` I simply did `dropTrans 5 .| return ()`, then the `dropTrans` would never have a chance to fire, since execution is driven from downstream. Also, as demonstrated, it's really easy to capture this transformer behavior from the sink behavior; the other way is trickier.
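To make the "never has a chance to fire" point concrete, here is a small sketch that counts what is left in the stream after each variant. The names `neverFires` and `fires` are invented for this illustration, and the pure `runConduitPure` runner is used in place of the `remaining` helper so that the results are plain `Int`s.

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Conduit

dropSink :: Monad m => Int -> ConduitM i o m ()
dropSink cnt
    | cnt <= 0 = return ()
    | otherwise = await >> dropSink (cnt - 1)

dropTrans :: Monad m => Int -> ConduitM i i m ()
dropTrans cnt = dropSink cnt >> mapC id

-- The downstream side (return ()) finishes immediately, so dropTrans is
-- never asked to run and all 10 values are still there afterwards.
neverFires :: Int
neverFires = runConduitPure
    $ yieldMany [1 .. 10 :: Int]
   .| ((dropTrans 5 .| return ()) >> lengthC)

-- For contrast: dropSink is bound monadically, so it always runs.
fires :: Int
fires = runConduitPure
    $ yieldMany [1 .. 10 :: Int]
   .| (dropSink 5 >> lengthC)

main :: IO ()
main = do
    putStrLn ("dropTrans 5 .| return (): " ++ show neverFires) -- 10 left
    putStrLn ("dropSink 5:               " ++ show fires)      -- 5 left
```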

My point here is that we have two legitimate definitions of a function. And from my experience, different people expect different behavior for the function. In fact, some people (myself included) intuitively expect different behavior *depending on the circumstance*! This is what earns `drop` the title of worst function in conduit.

To make it even more clear how bad this is, let's see how you can misuse these functions unintentionally.

```
badDropSink :: IO ()
badDropSink = runConduit
    $ src10
   .| dropSink 5
   .| remaining
```

This code looks perfectly reasonable, and if we just replaced `dropSink` with `dropTrans`, it would be correct. But instead of saying, as expected, that we have 5 values remaining, this will print 0. The reason: `src10` yields 10 values to `dropSink`. `dropSink` drops 5 of those and leaves the remaining 5 untouched. But `dropSink` never itself yields a value downstream, so `remaining` receives nothing.

Because of the type system, it's slightly trickier to misuse `dropTrans`. Let's first do the naive thing of just assuming it's `dropSink`:

```
badDropTrans :: IO ()
badDropTrans = runConduit
    $ src10
   .| (dropTrans 5 >> remaining)
```

GHC does not like this one bit:

```
error:
    • Couldn't match type ‘Int’ with ‘Data.Void.Void’
      Expected type: ConduitM () Data.Void.Void IO ()
        Actual type: ConduitM () Int IO ()
```

The problem is that `runConduit` expects a pipeline where the final output value is `Void`. However, `dropTrans` has an output value of type `Int`. And if it's yielding `Int`s, so must `remaining`. This is definitely an argument in favor of `dropTrans` being the better function: the type system helps us a bit. (It's also an argument in favor of keeping the type signature of `runConduit` as-is.)

However, it's still possible to accidentally screw things up in bigger pipelines, e.g.:

```
badDropTrans :: IO ()
badDropTrans = runConduit
    $ src10
   .| (dropTrans 5 >> remaining)
   .| (sinkList >>= liftIO . print)
```

This code may look a bit contrived, but in real-world Conduit code it's not at all uncommon to deeply nest these components in such a way that the type error above never shows up. You may be surprised to hear that the output of this program is:

```
Remaining: 0
[6,7,8,9,10]
```

The reason is that the `sinkList` is downstream from `dropTrans`, and grabs all of its output. `dropTrans` itself will drain all output from `src10`, leaving nothing behind for `remaining` to grab.

The Conduit libraries use the `dropSink` variety of function. I wish there was a better approach here that felt more intuitive to everyone. The closest I can think of to that is deprecating `drop` and replacing it with more explicitly named `dropSink` and `dropTrans`, but I'm not sure how I feel about that (feedback welcome, and other ideas *certainly* welcome).
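As a quick check of that claim, the sketch below runs the library's own `dropC` (the `drop` exported by the `Conduit` module) in both positions. It assumes the same lts-8.12 snapshot as the rest of this post; the names `boundMonadically` and `fusedInline` are invented for this illustration.

```haskell
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Conduit

-- dropC used the way dropSink is meant to be used: bound monadically,
-- so the next component picks up the remaining values.
boundMonadically :: Int
boundMonadically = runConduitPure
    $ yieldMany [1 .. 10 :: Int]
   .| (dropC 5 >> lengthC)

-- dropC fused inline as if it were a transformer: it yields nothing
-- downstream, so lengthC sees an empty stream.
fusedInline :: Int
fusedInline = runConduitPure
    $ yieldMany [1 .. 10 :: Int]
   .| dropC 5
   .| lengthC

main :: IO ()
main = do
    putStrLn ("dropC 5 >> lengthC: " ++ show boundMonadically) -- 5
    putStrLn ("dropC 5 .| lengthC: " ++ show fusedInline)      -- 0
```

The second pipeline type checks without complaint, which is exactly the `badDropSink` trap from earlier.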

Full code

```
#!/usr/bin/env stack
-- stack --resolver lts-8.12 script
import Conduit

dropSink :: Monad m => Int -> ConduitM i o m ()
dropSink cnt
    | cnt <= 0 = return ()
    | otherwise = await >> dropSink (cnt - 1)

dropTrans :: Monad m => Int -> ConduitM i i m ()
dropTrans cnt = dropSink cnt >> mapC id

src10 :: Monad m => ConduitM i Int m ()
src10 = yieldMany [1..10]

remaining :: MonadIO m => ConduitM i o m ()
remaining = lengthC >>= \x -> liftIO (putStrLn ("Remaining: " ++ show x))

goodDropSink :: IO ()
goodDropSink = runConduit
    $ src10
   .| (dropSink 5 >> remaining)

badDropSink :: IO ()
badDropSink = runConduit
    $ src10
   .| dropSink 5
   .| remaining

goodDropTrans :: IO ()
goodDropTrans = runConduit
    $ src10
   .| dropTrans 5
   .| remaining

badDropTrans :: IO ()
badDropTrans = runConduit
    $ src10
   .| (dropTrans 5 >> remaining)
   .| (sinkList >>= liftIO . print)

main :: IO ()
main = do
    putStrLn "Good drop sink"
    goodDropSink
    putStrLn "Bad drop sink"
    badDropSink
    putStrLn "Good drop trans"
    goodDropTrans
    putStrLn "Bad drop trans"
    badDropTrans
```

Full output

```
Good drop sink
Remaining: 5
Bad drop sink
Remaining: 0
Good drop trans
Remaining: 5
Bad drop trans
Remaining: 0
[6,7,8,9,10]
```